User:Alvations/py cheatsheet

From Wikipedia, the free encyclopedia

I am fairly new to Python, though not new to programming, so here I've compiled some recipes that I often use in my NLP work. I also have another userpage with an NLTK-related cheatsheet.

String Manipulation

Insert a substring into a string

Often I want to insert a character or substring at a specified position. Python strings are immutable and have no built-in insert method, so the usual approach is to slice the string and build a new one.

# Inserts a string into an original string at a specified position.
def insert(original, insertee, pos):
  return original[:pos] + insertee + original[pos:]
print(insert("hoses", "r", 2))
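
For reference, a small sketch of how the slicing behaves at the edges. The name `insert_at` is my own, used only for illustration; the example relies on Python's usual slice rules, where an out-of-range position does not raise an error and a negative position counts from the end.

```python
def insert_at(original, insertee, pos):
    # Slices never raise IndexError, so out-of-range positions clamp to the ends.
    return original[:pos] + insertee + original[pos:]

print(insert_at("hoses", "r", 2))    # horses
print(insert_at("hoses", "!", 99))   # hoses!
print(insert_at("hoses", "r", -3))   # horses
```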

Stripping multiple strings in a list

As advised by eduffy, there is a simple way of stripping every string in a list. (source: http://stackoverflow.com/questions/12182777/is-there-a-better-way-to-use-strip-on-a-list-of-strings-python)

alist = ["   foobar ","foo   ","  foo bar   ","   bar"]
# A list comprehension returns a list on both Python 2 and 3 (map() is lazy on Python 3).
alist = [s.strip() for s in alist]
print(alist)

List/Sets Manipulation

Find the difference between 2 lists

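Without further ado, a list comprehension that keeps the items of the first list that do not appear in the second, preserving their order and any duplicates: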

# Returns the difference between 2 lists.
def getDiff(lista, listb):
  return [x for x in lista if x not in listb]
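
For long lists, `x not in listb` scans the whole of listb for every item. A possible variant, assuming every item is hashable, builds a set once for the lookups. The name `get_diff_fast` is hypothetical and the result should be the same as the list-only version.

```python
def get_diff_fast(lista, listb):
    # Build the lookup set once; membership tests are then O(1) on average.
    seen = set(listb)
    # Order and duplicates of lista are preserved.
    return [x for x in lista if x not in seen]

print(get_diff_fast(["a", "b", "c", "b"], ["b"]))  # ['a', 'c']
```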

Find the overlap between 2 lists

I'm kind of a Lesk algorithm junkie, so I always want to find overlaps between two lists, sets, dicts, etc.

# Returns the overlap between 2 lists.
def getOverlap(lista, listb):
  return [x for x in lista if x in listb]
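
When only the size of the overlap matters, as in a simplified Lesk-style gloss comparison, a multiset intersection is one option. This is a rough sketch rather than a full Lesk implementation; the name `count_overlap` and the two example sentences are made up for illustration.

```python
from collections import Counter

def count_overlap(lista, listb):
    # Counter & Counter keeps each shared item min(count_a, count_b) times.
    shared = Counter(lista) & Counter(listb)
    return sum(shared.values())

gloss = "a financial institution that accepts deposits".split()
context = "he deposits money at the financial institution".split()
print(count_overlap(gloss, context))  # 3
```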

Dictionary/Collections Manipulation

Find the first instance of a key given a value in a dictionary

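# Gets the first matching key (in sorted key order) given a value and a dictionary.
# Returns None if no key maps to the value.
def getKey(dic, value):
  matches = [k for k,v in sorted(dic.items()) if v == value]
  return matches[0] if matches else None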
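
If many lookups by value are needed, scanning the dictionary each time gets slow. One alternative, assuming the values are hashable, is to invert the dictionary once. The name `invert` and the sample tags below are illustrative only.

```python
from collections import defaultdict

def invert(dic):
    # Map each value to the list of keys that carry it, in sorted key order.
    inverted = defaultdict(list)
    for k, v in sorted(dic.items()):
        inverted[v].append(k)
    return dict(inverted)

pos_tags = {"run": "VB", "dog": "NN", "cat": "NN"}
print(invert(pos_tags)["NN"])  # ['cat', 'dog']
```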

Most of the following installation hints are based on my Ubuntu 12.04 LTS distro. The pyLucene recipes are OS-independent, but they are based on pylucene version 2.3.

How to install pyDev in Eclipse IDE?

First, get the Eclipse IDE. I prefer to install and start it from the terminal:

$ sudo apt-get install eclipse-platform
$ eclipse

To install pyDev, follow the installation page on the pyDev main site, which has instructions and screenshots: http://pydev.org/manual_101_install.html

How to install pyLucene?

The following instructions install pylucene v2.3 on top of openjdk-6. The last command checks that the JVM shared library (libjvm) is visible to the dynamic linker.

$ sudo apt-get install openjdk-6-jdk openjdk-6-jre-headless openjdk-6-jre-lib
$ sudo apt-get install ant ant-doc ant-optional
$ sudo apt-get install ivy ivy-doc
$ sudo apt-get install pylucene
$ sudo apt-get install python-dev
$ ldconfig -p | grep libjvm

You should now be able to run pylucene code on your machine. Sample code can be found at http://www.tomergabel.com/ScriptingLuceneWithPython.aspx

How to get rid of the wiggly lines for pylucene code in Eclipse IDE?

Scipy / Numpy Recipes

How to plot a simple scatterplot with a linear regression line using two lists of x and y?

from numpy import polyfit, polyval
from matplotlib.pyplot import plot, scatter, grid, xlabel, ylabel, show

def plotGraph(xlist,ylist,label4x="x-axis",label4y="y-axis"):
  # Fits a degree-1 polynomial (a straight line) to the x and y values
  #  and returns m=gradient and b=y-intercept.
  (m,b)=polyfit(xlist,ylist,1) #print("Gradient:", m, "y-intercept:", b)
  # Creates the regression line.
  yp=polyval([m,b],xlist)
  # Plots the line; Plots the scatter points.
  plot(xlist,yp); scatter(xlist,ylist)
  grid(True)
  # Provides captions for the axis labels.
  xlabel(label4x); ylabel(label4y)
  show()
  return None

How to convert two lists of x and y values into numpy data array list?

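import numpy as np
def list2data(xlist,ylist):
  # Pairs up the x and y values; the resulting array has shape (n, 2).
  # The built-in zip works on Python 2 and 3 (izip exists only in Python 2's itertools).
  data = np.array(list(zip(xlist, ylist)))
  return data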

Check Scipy / Numpy versions

>>> import scipy
>>> scipy.__version__
'0.13.3'
>>> import numpy
>>> numpy.__version__
'1.9.1'


Time Processes / Multithreading

How to measure time taken between lines of code in python?

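# time.clock() was removed in Python 3.8; default_timer works on Python 2 and 3.
from timeit import default_timer as timer
start = timer()
# your code here
print(timer() - start)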
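
To time several blocks without repeating the start/stop lines, the same idea can be wrapped in a context manager. This is only a sketch; the helper name `timed` and the `timings` dictionary are my own choices, not part of any library.

```python
from contextlib import contextmanager
from timeit import default_timer as timer

@contextmanager
def timed(label, results):
    # Record the wall-clock seconds spent in the enclosed block under `label`.
    start = timer()
    try:
        yield
    finally:
        results[label] = timer() - start

timings = {}
with timed("sum", timings):
    total = sum(range(100000))
print(timings["sum"])  # elapsed seconds, varies per machine
```

The `finally` clause stores the elapsed time even if the timed block raises an exception.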


Packaging

http://the-hitchhikers-guide-to-packaging.readthedocs.org/en/latest/

https://caremad.io/2013/07/setup-vs-requirement/