Saturday, April 1, 2017

How can I create a DataFrame slice object piece by piece?

I have a DataFrame and want to select certain rows and columns from it. I know how to do this with loc, but I want to be able to specify each criterion individually rather than all in one go.

    import numpy as np
    import pandas as pd

    idx = pd.IndexSlice

    index = [np.array(['foo', 'foo', 'qux', 'qux']),
             np.array(['a', 'b', 'a', 'b'])]
    columns = ["A", "B"]
    df = pd.DataFrame(np.random.randn(4, 2), index=index, columns=columns)
    print(df)
    print(df.loc[idx['foo', :], idx['A':'B']])

                  A         B
    foo a  0.676649 -1.638399
        b -0.417915  0.587260
    qux a  0.294555 -0.573041
        b  1.592056  0.237868

                  A         B
    foo a -0.470195 -0.455713
        b  1.750171 -0.409216

Requirement

I want to achieve the same result with something like the following code, where I specify each criterion one by one. It's also important that I can use a slice_list to allow dynamic behaviour, i.e. the syntax should work whether there are two, three or ten different criteria in the slice_list.

    slice_1 = 'foo'
    slice_2 = ':'
    slice_list = [slice_1, slice_2]

    column_slice = "'A':'B'"
    print(df.loc[idx[slice_list], idx[column_slice]])

2 Answers

Answer 1

You can achieve this with the built-in slice function. You can't build slices from strings: inside a string, ':' is just a literal character, not slice syntax.

    slice_1 = 'foo'
    slice_2 = slice(None)            # same as ':'
    column_slice = slice('A', 'B')   # same as 'A':'B'
    df.loc[idx[slice_1, slice_2], idx[column_slice]]
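To cover the slice_list part of the question, here is a minimal sketch that assumes the criteria are kept in a list with one entry per index level, in level order. The helper name select is hypothetical. It relies on the fact that idx[...] simply returns whatever key it is given, so converting the list to a tuple gives loc the same thing as idx[c1, c2, ...]; a plain list would instead be read as a list of labels.

```python
import numpy as np
import pandas as pd

index = [np.array(['foo', 'foo', 'qux', 'qux']),
         np.array(['a', 'b', 'a', 'b'])]
df = pd.DataFrame(np.random.randn(4, 2), index=index, columns=["A", "B"])


def select(frame, row_criteria, column_criterion):
    """Hypothetical helper: one row criterion per index level, in level order."""
    # A tuple of per-level criteria is what idx[c1, c2, ...] would produce.
    return frame.loc[tuple(row_criteria), column_criterion]


slice_list = ['foo', slice(None)]   # 'foo' on level 0, everything on level 1
column_slice = slice('A', 'B')      # columns 'A' through 'B', inclusive

result = select(df, slice_list, column_slice)
print(result)
```

The list can grow to as many entries as the index has levels. Note that slicing on a MultiIndex generally expects the index to be sorted, as it is in this example.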

Answer 2

You might have to build your "slice lists" a little differently than you intended, but here's a relatively compact method using df.merge() and df.loc[]:

    # Build a "query" dataframe whose index lists the wanted rows
    slice_df = pd.DataFrame(index=[['foo', 'qux', 'qux'], ['a', 'a', 'b']])
    # Explicitly name columns
    column_slice = ['A', 'B']

    # The inner join on the index keeps only the rows listed in slice_df
    slice_df.merge(df, left_index=True, right_index=True, how='inner').loc[:, column_slice]

    Out[]:
                  A         B
    foo a  0.442302 -0.949298
    qux a  0.425645 -0.233174
        b -0.041416  0.229281

This method also requires you to be explicit about your second index and columns, unfortunately. But computers are great at making long tedious lists for you if you ask nicely.
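As a rough illustration of letting the computer write out that list, the sketch below generates the explicit index pairs with pd.MultiIndex.from_product and then applies the same merge. The names first_level and second_level, and the labels chosen for them, are assumptions made for this example.

```python
import numpy as np
import pandas as pd

index = [np.array(['foo', 'foo', 'qux', 'qux']),
         np.array(['a', 'b', 'a', 'b'])]
df = pd.DataFrame(np.random.randn(4, 2), index=index, columns=["A", "B"])

# Assumed criteria for illustration: the labels wanted on each index level
first_level = ['qux']
second_level = ['a', 'b']
column_slice = ['A', 'B']

# Expand the per-level labels into every explicit (level 0, level 1) pair
query_index = pd.MultiIndex.from_product([first_level, second_level])
slice_df = pd.DataFrame(index=query_index)

# The inner join on the index keeps only the rows named in the query frame
result = slice_df.merge(df, left_index=True, right_index=True,
                        how='inner').loc[:, column_slice]
print(result)
```

Because the join is an inner join, pairs that do not exist in df are silently dropped rather than raising an error.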
