Quick introduction to Pandas (create dataframe, assign values to dataframe cells, save dataframe as csv, load csv as dataframe)

Pandas is an open source library for Python that provides new data structures and data analysis tools. Its two main data structures are the Series [1-D] and the DataFrame [2-D], both capable of holding any data type; older releases also shipped a 3-D Panel, which has since been removed from the library. The DataFrame is the most commonly used pandas object and can be visualized as a spreadsheet [a 2-D structure whose columns can have different data types]. I would suggest installing the entire SciPy stack before using pandas. More information is given at “How to install scipy stack ?”
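Once pandas is installed, the difference between a Series and a DataFrame can be seen in a few lines. This is only a minimal sketch: the names goals and players and the values in them are made up for illustration.

```python
import pandas as pd

# A Series is a labelled one-dimensional array.
goals = pd.Series([12, 11], index=['Zidane', 'Figo'])

# A DataFrame is a labelled two-dimensional table.
# Each column can hold its own data type (here: integers and booleans).
players = pd.DataFrame(
    {'goals': [12, 11], 'captain': [True, False]},  # illustrative values only
    index=['Zidane', 'Figo'],
)

print(goals['Zidane'])   # look up one value by its label
print(players.shape)     # (number of rows, number of columns)
print(players.dtypes)    # one data type per column
```

The row labels play the same role as the first column of a spreadsheet, and the dictionary keys become the column headers.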

Assuming that the installation is completed, let us get our hands dirty with pandas. Open your favorite python IDE. Let us first initialize a dataframe using python. The following piece of code initializes an empty pandas dataframe.

import pandas as pd

columns=['2002','2003','2004','2005']
index=['Zidane','Figo','Beckham','Totti']
df = pd.DataFrame(columns=columns,index=index)
print(df)

The output is given below:

[Screenshot: the printed dataframe with all cells empty (NaN)]

This code creates a table with soccer stars' names as rows and years as columns. So far we have only initialized the dataframe. Let us fill some of the empty cells with the number of goals scored by each player.

import pandas as pd

columns=['2002','2003','2004','2005']
index=['Zidane','Figo','Beckham','Totti']
df = pd.DataFrame(columns=columns,index=index)

# df.loc[row, column] sets one cell; it is the safer form of df[column][row]
df.loc['Zidane','2002']=12
df.loc['Figo','2003']=11
df.loc['Beckham','2004']=8
df.loc['Totti','2005']=16

print(df)

The output is given below.

[Screenshot: the dataframe with the four updated cells]

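In the output, we see the updated cells. So to update a particular cell, ‘df[columnname][rowname]=somevalue’ can be used; pandas recommends the equivalent ‘df.loc[rowname, columnname]=somevalue’, because the chained form may not update the original dataframe in newer versions. So far, we have initialized the dataframe and updated values. Now let us save the dataframe to a csv file. This is pretty easy: after updating the values, call the function below to write the csv file.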

df.to_csv('out.csv', sep=',')

This saves the dataframe as out.csv in the current working directory, which is the script's directory when the script is run from there. The csv file opened in LibreOffice Calc is displayed below.

[Screenshot: out.csv opened in LibreOffice Calc]

So we have now saved the pandas dataframe to a csv file on the hard disk. Now let us load the saved csv file back into pandas as a dataframe.

import pandas as pd

new_df=pd.read_csv('out.csv',index_col=0)
print(new_df)

The output is given below.

[Screenshot: the dataframe loaded from out.csv]

Congrats! You have successfully created a pandas dataframe, updated its values, saved it to a csv file and loaded the csv file back as a new dataframe in Python.
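The whole round trip can also be put together in one script. This is only a sketch under a few assumptions: roundtrip is a hypothetical helper name, the csv file is written to a temporary folder instead of the working directory, and only two of the cells from the example above are filled.

```python
import os
import tempfile

import pandas as pd


def roundtrip(df, directory):
    """Save df as out.csv inside directory and load it back as a new dataframe."""
    path = os.path.join(directory, 'out.csv')
    df.to_csv(path, sep=',')
    # index_col=0 turns the first csv column (the player names) back into the index
    return pd.read_csv(path, index_col=0)


columns = ['2002', '2003', '2004', '2005']
index = ['Zidane', 'Figo', 'Beckham', 'Totti']
df = pd.DataFrame(columns=columns, index=index)
df.loc['Zidane', '2002'] = 12
df.loc['Totti', '2005'] = 16

# The temporary folder is deleted again when the with-block ends
with tempfile.TemporaryDirectory() as tmp:
    new_df = roundtrip(df, tmp)

print(new_df)
```

Note that the goals come back as floating point numbers (12.0 instead of 12), because a column that contains empty cells is read as floats by read_csv.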

How to install OpenCV 3.0 for Python 3.4 in Debian Jessie

OpenCV provides lots of functions for real-time computer vision and image processing applications. It is written mainly in C++, but it can also be used from Python with the help of Python bindings. There are already enough resources on the web with instructions for installing OpenCV on Ubuntu.

Install OpenCV 3.0 and Python 2.7+ on Ubuntu (By Adrian of pyimagesearch.com)

Install OpenCV 3.0 and Python 3.4+ on Ubuntu (By Adrian of pyimagesearch.com)

But the installation of OpenCV 3.0 for Python 3.4 on Debian Jessie (8.x) proved to be a little tricky. So I decided to write a tutorial for installing OpenCV 3 on Debian Jessie with Python 3 support.

  • The first step is to install all the dependencies needed. Open the terminal and drop to a root shell (in case the OS has a root user), or add ‘sudo’ before all the terminal commands in this tutorial. Also make sure that the multimedia, contrib and non-free repositories are enabled. These repositories are not enabled by default and have to be added to Debian’s sources list (the file /etc/apt/sources.list). Beginners can use this website to generate the sources.list contents; further instructions are provided on the website. When everything is set up, run the following commands in the terminal as sudo or root user.


$ aptitude update


$ aptitude upgrade


$ aptitude install libjpeg-dev libtiff5-dev libjasper-dev libpng12-dev build-essential cmake git pkg-config libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libatlas-base-dev gfortran python3.4-dev python3-numpy python3-scipy python3-matplotlib ipython3 python3-pandas ipython3-notebook python3-tk libtbb-dev libeigen3-dev yasm libfaac-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev libvorbis-dev libxvidcore-dev libx264-dev libqt4-dev libqt4-opengl-dev sphinx-common texlive-latex-extra libdc1394-22-dev

      • Then, download the latest version of OpenCV from the github page.
        Click here to start OpenCV download !
      • Now, download the OpenCV extra and non-free modules (needed for SIFT, SURF and other non-free algorithms).
        Download OpenCV non-free modules!
      • When the downloads are completed, go to the home folder, create a folder named ‘software’, extract the OpenCV archive inside the software folder and rename the extracted folder to ‘opencv-3’.
      • Inside the opencv-3 folder, create a new folder and name it ‘build’. Extract the OpenCV extra modules archive inside this folder. The extracted folder is named ‘opencv_contrib-master’ by default.
      • Navigate to the ‘build’ folder in the terminal and drop to a root shell, or use sudo, while entering the following commands in the terminal window


$ cp /usr/include/x86_64-linux-gnu/python3.4m/pyconfig.h /usr/include/python3.4m/

      • Now we are ready to configure the OpenCV build. In the terminal window, enter the following command


$ cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D OPENCV_EXTRA_MODULES_PATH=./opencv_contrib-master/modules -D BUILD_EXAMPLES=ON -D WITH_TBB=ON -D WITH_V4L=ON -D PYTHON_EXECUTABLE=/usr/bin/python3 -D PYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.4m.so -D PYTHON_INCLUDE_DIR=/usr/include/python3.4m/ -D PYTHON_INCLUDE_DIR2=/usr/include/x86_64-linux-gnu/python3.4m/ -D PYTHON_NUMPY_INCLUDE_DIRS=/usr/lib/python3/dist-packages/numpy/core/include/ -D BUILD_opencv_java=OFF ..

The terminal output should look similar to the screenshot below.

[Screenshot: cmake configuration output]
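The PYTHON_* values passed to cmake above are specific to Python 3.4 on a 64-bit Debian system. If you are unsure about them, they can be cross-checked from the interpreter itself. This is a minimal sketch, assuming it is run with the same python3 that OpenCV is being built for; python_build_paths is a hypothetical helper name, and the numpy entry is left empty if numpy is not installed.

```python
import sys
import sysconfig


def python_build_paths():
    """Collect interpreter paths that can be compared with the PYTHON_* cmake options."""
    paths = {
        'PYTHON_EXECUTABLE': sys.executable,
        'PYTHON_INCLUDE_DIR': sysconfig.get_paths()['include'],
        # Folder that holds the libpython shared library (may be None on some systems)
        'LIBDIR': sysconfig.get_config_var('LIBDIR'),
    }
    try:
        import numpy
        paths['PYTHON_NUMPY_INCLUDE_DIRS'] = numpy.get_include()
    except ImportError:
        paths['PYTHON_NUMPY_INCLUDE_DIRS'] = None
    return paths


paths = python_build_paths()
for name in sorted(paths):
    print(name, '=', paths[name])
```

If the printed folders differ from the ones in the cmake command, adjust the command to match your system before running it.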
  • Check that the Python 3 interpreter, libraries, numpy and packages paths are detected, and that the Python used for the build is python3
  • To start the make process, enter the following command


$ make -j4

  • When the make process has finished successfully, enter the following command to complete the installation of OpenCV 3.0 for Python 3.4


$ make install

  • When the installation process is finished, enter the following commands to test whether the installation was successful


$ python3   # enters the Python shell
>>> import cv2
>>> print(cv2.__version__)

 

The Python interpreter prints the version of OpenCV installed for Python 3. That's all! You have successfully installed OpenCV 3.0 for Python 3.4 on Debian Jessie. Feel free to post your feedback. Thank you!
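The same check can be kept as a small script, which is handy if you want to test the installation again later. This is just a sketch; opencv_version is a hypothetical helper name, and it reports a missing cv2 module instead of stopping with an error.

```python
def opencv_version():
    """Return the OpenCV version string, or None if cv2 cannot be imported."""
    try:
        import cv2
    except ImportError:
        return None
    return cv2.__version__


version = opencv_version()
if version is None:
    print('cv2 could not be imported by this Python interpreter')
else:
    print('OpenCV version:', version)
```

Run it with python3, since the bindings built in this tutorial are installed for Python 3 only.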

 

An elegant way to count the loop iterations in python 3

So far, when we had to count the number of iterations in a loop, most of us 🙂 would have done something similar to this.

# Named 'letters' rather than 'list', which would shadow the built-in list type
letters=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p']
cnt=0

for item in letters:
    cnt+=1
    print(cnt)

    if cnt%5==0:
        print('Multiples of 5')

This piece of code initializes a list of letters, iterates over the list and prints a message whenever the counter (the cnt variable) is a multiple of five.

But there is a more elegant and pythonic way to do this, using the ‘enumerate()’ function. Try this.


letters=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p']

# start=1 makes enumerate() count the iterations from 1, like the manual counter
for count, item in enumerate(letters, start=1):
    print(count)
    if count % 5 == 0:
        print('Multiples of 5')