[PYTHON] I want to detect unauthorized logins to Facebook with Jubatus (1)

Last time, I built a classifier that was not very useful, so this time I will build something a little more practical.

Facebook data download

On Facebook, click the "▼" mark in the upper right (as of July 2014) and open the "Settings" item. At the bottom of "General account settings" there is a link called "Download Facebook data". If you click it and go through the various authentication steps, you can download the data you have posted on Facebook in HTML (.htm) format. I downloaded it casually, but honestly it contained so much sensitive information that I was a little taken aback. I suspect that a recommender built on this kind of data could be quite accurate.

security

The topic of this post is the "security" item in the data downloaded above. It shows who logged in to the account on Facebook, from where, and so on, and I use it here to detect unauthorized access.

As before, I use Jubatus. It provides an outlier detection API called jubaanomaly, so I use that.

configuration

The Jubatus server configuration file is here. https://github.com/chase0213/anomal_facebook_activity/blob/master/lof.json

It is basically the sample configuration as is, so there is nothing in particular to explain. One of the nice things about Jubatus is that you can build something reasonably usable even with the sample left unchanged.

This post is "with Jubatus (1)", so by the time I publish (2), I expect to have tuned this part properly.

pre-processing

The data downloaded from Facebook is marked up as HTML, so it has to be converted into a suitable form. This time I wrote a Python script (Python 2.6) to do the conversion. https://github.com/chase0213/anomal_facebook_activity/blob/master/data/trim.py

(You could probably do something similar with string_rules, but I needed a little more formatting than that.)
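As a rough illustration of this kind of conversion, here is a minimal sketch using only the Python 3 standard library. It is not the trim.py script linked above. The markup it parses (one table row per record, with cells in a fixed order) is purely an assumption for illustration, and the real Facebook export is laid out differently. The names RowExtractor, extract_records and FIELDS are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical field order; the keys mirror the Datum used later in this post
# ("brawser" is spelled as in that Datum).
FIELDS = ("activity", "time", "ip_address", "brawser", "cookie")


class RowExtractor(HTMLParser):
    """Collect the text of every <td> in each <tr> as one record."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cells = None      # cells of the row being read, or None outside a row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag == "td" and self._cells is not None:
            self._in_cell = True
            self._cells.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            # Keep only rows that have exactly one cell per expected field.
            if len(self._cells) == len(FIELDS):
                cleaned = (cell.strip() for cell in self._cells)
                self.rows.append(dict(zip(FIELDS, cleaned)))
            self._cells = None


def extract_records(html_text):
    """Return a list of dicts, one per well-formed table row."""
    parser = RowExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.rows


sample = (
    "<table><tr><td>Login</td><td> July 1, 2014 09:00 UTC+09 </td>"
    "<td>192.0.2.1</td><td>Chrome</td><td>abc</td></tr></table>"
)
print(extract_records(sample))
```

Each resulting dict could then be wrapped in a Datum before being sent to the server.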

anomaly detection

Once the data is ready, pass it to the Jubatus server to compute anomaly scores. First, create a client:

client = jubatus.Anomaly(HOST,PORT,NAME)

With the Jubatus server running and the client created as above, add each record with

ret = client.add(datum)

This adds the data. Here, datum is a jubatus.common.Datum object, and ret contains the ID assigned to the record and its anomaly score. At first, it is a good idea to inspect the return values of client.add(datum) and confirm that the scores are distributed around 1.0. If any score is clearly far from 1.0, take a closer look at that record: it may indicate unauthorized access.
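Why 1.0? The configuration file is named lof.json, which suggests the Local Outlier Factor (LOF) method. Assuming that is the case, a score near 1.0 means a record is about as densely surrounded as its neighbors, while a much larger score means it sits in a sparser region. The following is a minimal pure-Python sketch of textbook LOF on toy 2-D points. It is not Jubatus's implementation, whose internals may differ, and the function names are my own.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(points, i, k):
    """Return [(distance, index)] for the k nearest neighbours of points[i]."""
    dists = sorted(
        (euclidean(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k=2):
    """Compute the Local Outlier Factor of every point (no duplicate points)."""
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    def local_reachability_density(i):
        # reach-dist(i, j) = max(k-distance(j), d(i, j))
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        return len(reach) / sum(reach)

    lrd = [local_reachability_density(i) for i in range(n)]
    # LOF = average ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Four points forming a tight square, plus one point far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
for point, score in zip(points, lof_scores(points, k=2)):
    print(point, score)
```

In this toy data the four corners of the square score 1.0, and the distant point scores well above 1.0, which matches the pattern described above.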

Next, once a reasonable amount of data is stored on the Jubatus server, I give it a record that is completely different from the usual ones.

anomal_datum = Datum({
    "activity": "DELETE",
    "time":     "July 15, 2014 17:59 UTC+12",
    "ip_address": "127.0.0.1",
    "brawser": "IE6",
    "cookie": "???"
})

After defining the record, have the Jubatus server compute its anomaly score. Here we want to check whether the record is abnormal without registering it, so we use client.calc_score instead of client.add.

anomality = client.calc_score(anomal_datum)

calc_score returns a float, so you can do whatever you like with it.

I would like to show the "ordinary data" as well, but I will omit it, since publishing it would only invite more anomalous access to my own account.

Putting it all together, the script looks like this. https://github.com/chase0213/anomal_facebook_activity/blob/master/anomaly.py

Here is the result of running it.

$ python anomaly.py
anomality(anomal datum): 2.33819794655
anomality(nomal datum): 0.999999880791

The second line is the score for the anomalous data defined above, and the third line is the score for the "ordinary data". The anomaly score is clearly higher for the anomalous data. In practice, it would probably be better to apply a statistical test or a threshold and raise an alert.
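One simple way to turn the score into an alert is sketched below. It compares a new score against scores collected during normal use by means of a z-score. The baseline values, the threshold of 3.0 and the function name is_anomalous are all assumptions for illustration. Only the two scores being checked come from the output above.

```python
import statistics


def is_anomalous(score, baseline_scores, z_threshold=3.0, min_std=1e-6):
    """Return True if score lies more than z_threshold std devs above the baseline mean."""
    mean = statistics.mean(baseline_scores)
    # Guard against a zero standard deviation when all baseline scores are equal.
    std = max(statistics.pstdev(baseline_scores), min_std)
    return (score - mean) / std > z_threshold


# Made-up scores standing in for values returned by client.add during normal use.
baseline = [0.98, 1.01, 1.00, 1.03, 0.99]

print(is_anomalous(2.33819794655, baseline))   # score of the anomalous datum
print(is_anomalous(0.999999880791, baseline))  # score of the ordinary datum
```

Only scores above the baseline are flagged here, on the assumption that higher values mean "more anomalous". A fixed cutoff would also work if the baseline is known to be stable.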

That's all for this attempt at using Jubatus for something a little more useful than last time. Next time I would like to tune it more seriously.

As an aside, when I search Google for "Jubatus Anomaly", pages for the old API tend to come up. Please be aware that, depending on the version, the API may not work as documented (as was the case this time).
