Post to Facebook with Python

Let’s take a quick look at a great tool for posting to Facebook from Python called fbconsole. It’s fairly self-explanatory, so I won’t go into much textual detail here; I’ll take you through some of the basic features and highlight things as we go. In the following code snippet, we will post some data to our personal status feed, similar to how we push from KindleQuotes to Facebook when a user wants to share an item.

import fbconsole as F

# If you have an APP ID from Facebook, you can add that here.
# Otherwise, you will be posting as fbconsole.
F.APP_ID = '<YOURID>'

# Set some authentication parameters
# Here we will set the ability to publish to a stream
F.AUTH_SCOPE = ['publish_stream']

# Now we will login and get our OAuth token.
F.authenticate()

# After successful login, we can add our status update.
# Let's assume we have some neat data we want to push to the stream.
world_changing_data = 'Changing the world, one status update at a time.'

# We'll post our world_changing_data to our personal feed
F.post('/me/feed', {'message': world_changing_data})

# Want to logout and wipe your OAuth token?
F.logout()

We just completed a very simple data push to our Facebook status feed from within our Python program. There is a lot more functionality to check out once you have the basic setup working in your program.
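For example, reading data back out works along the same lines. Here is a quick sketch, assuming fbconsole's get() helper (the read counterpart to post()) returns the parsed JSON response:

import fbconsole as F

# This time we only need read access to the stream
F.AUTH_SCOPE = ['read_stream']
F.authenticate()

# Fetch our recent feed entries and print each message
feed = F.get('/me/feed')
for item in feed.get('data', []):
    print(item.get('message', '<no message>'))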

Cleansing HTML in Django Forms

We want to accept certain HTML in our Django form while stripping out the rest. Let’s take a quick look at how we can introduce some basic HTML-cleaning functionality to our Django ModelForm. We will be using Bleach to do all the dirty work.

from yourapp.models import YourModel
from django import forms
import bleach

class YourForm(forms.ModelForm):
    class Meta:
        model = YourModel
    
    def bleachData(self, data, whitelist=()):
        # Clean the data, allowing only the whitelisted tags through
        allowed = list(whitelist)
        clean_data = bleach.clean(data, tags=allowed)

        return clean_data

    def clean_somefield(self):
        somefield = self.cleaned_data['somefield']
        whitelist = ['b', 'i']
        somefield = self.bleachData(somefield, whitelist)

        return somefield 

    def clean(self):
        cleaned_data = super(YourForm, self).clean()
        self.cleaned_data['somefield'] = self.clean_somefield()

        return self.cleaned_data

As you can see, we run our normal form validation methods and then perform an additional cleaning pass by bleaching ‘somefield’, allowing only a whitelist of tags: bold and italics.
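To make the effect concrete, here is a quick standalone sketch of bleach.clean with that same whitelist. Note that by default Bleach escapes disallowed tags; passing strip=True removes them instead (the input string here is just an invented example):

import bleach

dirty = 'Keep <b>bold</b> and <i>italics</i>, drop the <script>evil()</script> bits'

# Only <b> and <i> survive; the <script> tags are stripped (their inner text remains)
print(bleach.clean(dirty, tags=['b', 'i'], strip=True))
# Keep <b>bold</b> and <i>italics</i>, drop the evil() bits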

Analyzing Brute Force Attempts in BASH

If you have a public facing SSH server running on the standard port, your message log is probably filled with failed authentication attempts from brute force attacks. Let’s mash up some quick BASH commands to analyze this data. For our purposes, we’ll look at the top attacker IPs and the top usernames tried.

First, pull down all of your message files and decompress them in a working directory using bunzip2.

Once you have all your message logs ready, we will search through them and pull out all of the authentication failure entries and grab the IP and username for each attempt.

grep -r "authentication error" messages* | awk '{split($0,a," "); print a[NF],a[NF-2]}' > attempts

We first use grep to find the failed authentications, searching recursively for the string “authentication error” across all of our message logs via the wildcard. We then pipe this to awk and split each matching line into an array delimited by whitespace. The end of each line, and therefore of our new array, reads something like ‘authentication error for USERNAME from IP’, so we can use the field-count variable NF to index the fields we need: a[NF] grabs the last field (the IP) and a[NF-2] grabs the field two back from that (the username). Finally, we redirect this output to a file called attempts.

Now, let’s use some more BASH magic to do the analysis for us. Our attempts file is now in the format: IP USERNAME. We want to see the top IPs and usernames, and we can do that with some sorting commands.

cut -d ' ' -f2 attempts | sort | uniq -c | sort -nr > attempts_username
cut -d ' ' -f1 attempts | sort | uniq -c | sort -nr > attempts_ip

Here, we simply grab either the username in the second column or the IP in the first with cut, sort this data, prepend lines with the number of occurrences of each, and then sort by this occurrence number and output to a new file. You can now view attempts_username and attempts_ip to see the top usernames and IPs, respectively, of brute force attacks.

Lastly, we can keep the usernames and IPs associated together and sort on one or the other to see the correlation between the two. To end our initial analysis, we will sort on usernames and find, for the top attempted usernames, the top originating IPs.

sort -k 2,2 attempts | uniq -c | sort -nr | head -n 10

Next time we will be using some GeoIP methods to see where our top attack attempts are originating.

Python – Dynamically Printing to Single Line of Stdout

It’s that time again; code snippet time. This is a short and sweet bit of code that has some really practical and powerful implications.

When handling large amounts of data processing, it’s often desirable to keep the user informed of where things are in terms of progress. However, filling up the scrollback buffer with thousands of “Completed processing [item X]” or “X % completed” is not always desirable. So, how do we handle verbosity without filling scrollback?

Here, we will print data using stdout, but we will continually reuse the same line of the terminal. It’s extremely simple: before each data print, we move the cursor back to the beginning of the current line, erase the entire line of text, and then print our current bit of data.

*Note: This uses ANSI escape sequences and will therefore work on any VT100-type terminal.

 

The code:

import sys

class Printer():
    """
    Print things to stdout on one line dynamically
    """

    def __init__(self, data):
        # "\r" returns the cursor to the start of the line and "\x1b[K"
        # (the ANSI erase-in-line sequence) clears it, so each update
        # overwrites the previous one instead of printing a new line
        sys.stdout.write("\r\x1b[K" + str(data))
        sys.stdout.flush()

 

To output data we might do something like:

totalFiles = len(fileList)
currentFileNum = 1.0

for f in fileList:
    ProcessFile(f)
    currentPercent = currentFileNum / totalFiles * 100
    output = "%.1f%% of %d files completed." % (currentPercent, totalFiles)
    Printer(output)
    currentFileNum += 1

 

And here is an example of what our output would look like in a practical scenario.

[Screenshots: processing some data, then progressing along with dynamic status updates on a single line]

Simple. You can test it out using a ‘for x in range(1,100)’ sort of statement as well. Now we can keep a persistent display of data processing or program progress without slamming the scrollback buffer of the user’s terminal.
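For example, a quick throwaway test using the Printer class above (the sleep is only there to make the updates visible):

import time

for x in range(1, 101):
    Printer("%d%% completed" % x)
    time.sleep(0.05)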

Stumpy v1.4.1 – Critical bug fix

Stumpy has been updated to version 1.4.1 to fix a critical bug that was in all previous versions [Thanks, Erik!]. There was an issue with case sensitivity in short URL lookups that is now all patched up.

For general information on the newer v1.4+ Stumpy, see post here.

Get the newest, fixed version here.

Stumpy is Now Available!

Stumpy has gone v1.0 and is now available on Github. The project is a URL shortener that I originally wrote in Python and have recently rebuilt from scratch to use the Django Python web framework. Now that I’ve reached what I felt was 1.0 functional code, I’ve moved the repository to the Github public section, where it is available under a BSD license.

Stumpy goes open source on Github

In brief, I have pulled in the first parts of the README.textile file you can find in the repository, which should do most of the explaining as to the what and why. The full file contains instructions on setting everything up and getting started testing it out. If you have any questions, please feel free to contact me. However, I’ve tried to keep the README clear enough that you can drop Stumpy into a Django environment easily, and the code should be pretty self-explanatory.

STUMPy

Mutaku URL Shortener

WHAT IS STUMPY?

STUMPy is a URL shortener written by xiao_haozi (Mutaku) using the Django Python web framework.

WHY STUMPY?

There are many URL shorteners out there, and STUMPy does not do anything groundbreaking.
However, there are several benefits that encouraged its development:

  • you keep all the data and can access it at your will
  • you keep all the code and can access/change it at your will
  • simple to use, simple to run, and simple code
  • because of its simplicity, it is easy to understand how URL shorteners work and some of the possible optimizations
  • uses the django framework which allows for easy expansion, management, and tweaking
  • django also allows for a nice web UI for administration of all of the data

REQUIREMENTS:

Everything is still in active development, so feel free to watch the repository to keep up with its evolution on Github. I will soon have the active Mutaku version of Stumpy up and running, which I will post as soon as it’s up so that you can see the code in action.

Head over to the home at Github and clone or fork the source and have at it. Please be sure to add any comments or problems to the issues page on Github and feel free to send in pull requests if you’ve added any fun, interesting, or helpful code.

Working with XLS in Python

As I’m sure you are well aware, Python has modules for just about any task. That’s one of the features of the language which makes it so efficient. In science, like most technical environments, we are dealing with lots of data and often in formats or layouts which are sub-optimal.

The following example code is a quick program I wrote a while back for dealing with Excel-based data from a plate reader setup. In short, reading through the code, you can see that we are merely moving data around by iteration sets and writing this correctly ordered data to a new Excel sheet. Data was being read and written to the original sheet sequentially by time, but we needed our data written out by sample set, which would be offset by both the number of replicates and the number of conditions.

We use the xlrd and xlwt modules to read and write Excel files, respectively. The iteration process for parsing the data is probably not of great interest to you, but the reading and writing of the Excel files will hopefully be of some help. Also note that you can apply some styling to your newly created sheet by using the easyxf function of xlwt.

Now, for the code.
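As a rough, stand-in illustration (not the original plate-reader program), a minimal sketch of the read/reorder/write pattern might look like the following; the file names, the replicate/condition counts, and the reordering step are placeholder assumptions:

import xlrd
import xlwt

# Read the original, time-ordered sheet (hypothetical input file name)
book = xlrd.open_workbook('plate_reader_raw.xls')
sheet = book.sheet_by_index(0)
rows = [sheet.row_values(r) for r in range(sheet.nrows)]

# Reorder rows by sample set; a fixed offset of replicates * conditions
# stands in for the original program's iteration logic
offset = 3 * 4
reordered = [rows[i] for start in range(offset)
             for i in range(start, len(rows), offset)]

# Write the reordered data to a new workbook, styling the (assumed) header row via easyxf
out_book = xlwt.Workbook()
out_sheet = out_book.add_sheet('reordered')
bold = xlwt.easyxf('font: bold on')
for r, row in enumerate(reordered):
    for c, value in enumerate(row):
        if r == 0:
            out_sheet.write(r, c, value, bold)
        else:
            out_sheet.write(r, c, value)
out_book.save('plate_reader_by_sample.xls')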