groupby in pandas
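groupby() returns a GroupBy object. We don’t need to care much about its internals: it behaves basically like a dictionary that maps each group key to the rows that belong to it (its groups attribute holds the index labels, and its indices attribute holds the row positions).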
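To make the dictionary analogy concrete, here is a minimal sketch. The DataFrame, its column names (team, score) and its index labels are made up for illustration; groups and indices are the two GroupBy attributes that expose the mapping.

```python
import pandas as pd

# Hypothetical sample data; the custom index makes labels and positions differ
df = pd.DataFrame(
    {"team": ["a", "b", "a", "c", "b"], "score": [1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14],
)

grouped = df.groupby("team")

# .groups maps each group key to the index labels of its rows
groups = {key: list(labels) for key, labels in grouped.groups.items()}

# .indices maps each group key to the integer positions of its rows
indices = {key: pos.tolist() for key, pos in grouped.indices.items()}

print(groups)   # labels per group
print(indices)  # positions per group
```

With this data, group "a" holds labels 10 and 12, which sit at positions 0 and 2.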
Any groupby operation involves one or more of the following steps on the original object −
- Splitting the Object
- Applying a function
- Combining the results
In many situations, we split the data into groups and apply a function to each group. In the apply step, we can perform the following operations −
- Aggregation − computing a summary statistic
- Transformation − perform some group-specific operation
- Filtration − discarding the data with some condition
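The three kinds of apply step listed above can be sketched as follows. The data and the column names team and score are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "team": ["a", "b", "a", "c", "b"],
    "score": [1, 2, 3, 4, 5],
})
grouped = df.groupby("team")

# Aggregation: one summary value per group
totals = grouped["score"].sum()

# Transformation: a group-specific operation; the result keeps the
# original shape (here, each score minus its group mean)
centered = grouped["score"].transform(lambda s: s - s.mean())

# Filtration: keep only the rows of groups that meet a condition
# (here, groups with at least two rows)
big_groups = grouped.filter(lambda g: len(g) >= 2)

print(totals)
print(centered)
print(big_groups)
```

Aggregation shrinks the result to one row per group, transformation returns one value per original row, and filtration returns a subset of the original rows.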
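The key point here is that groupby has a ‘hash’ flavor: rows are assigned to groups by looking up their key values, much like in a hash table.
Groupby is also related to hierarchical indexing: grouping by several keys gives a result that is indexed by a MultiIndex.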
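As a small sketch of the link between groupby and hierarchical indexing, grouping by two keys produces a result whose index is a MultiIndex. The columns team, year and score are hypothetical.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "year": [2020, 2021, 2020, 2020],
    "score": [1, 2, 3, 4],
})

# Grouping by two keys gives a Series indexed by (team, year)
totals = df.groupby(["team", "year"])["score"].sum()

print(type(totals.index).__name__)
print(totals.loc[("b", 2020)])  # look up one group by its key tuple
```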
index in pandas
The set_index() function can create a MultiIndex: pass it a list of column names and pandas will build the MultiIndex for you.
But this index stores a label for every row.
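A minimal sketch of this, again with made-up columns team, year and score: passing a list of column names to set_index() yields a MultiIndex with one entry per row, duplicates included.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "year": [2020, 2021, 2020, 2020],
    "score": [1, 2, 3, 4],
})

# A list of column names produces a two-level MultiIndex
indexed = df.set_index(["team", "year"])

print(indexed.index.nlevels)          # number of index levels
print(len(indexed.index) == len(df))  # one index entry per row
print(indexed.index.is_unique)        # repeated keys are kept as-is
```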
groupby vs indexing
I personally like groupby because it is quick. An index, as I said, holds a label for every row, so it may not be as efficient.
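For comparison, here is a rough sketch of fetching the same rows both ways. The data is hypothetical, and the snippet only shows that the two approaches return the same values; which one is faster depends on the data and is best measured on your own workload.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "team": ["a", "b", "a", "c", "b"],
    "score": [1, 2, 3, 4, 5],
})

# groupby route: fetch the rows belonging to one group
via_groupby = df.groupby("team")["score"].get_group("b")

# index route: make the key the index, then select by label
via_index = df.set_index("team").loc["b", "score"]

print(via_groupby.tolist())
print(via_index.tolist())
```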
This post was inspired by a time when I used an index and my colleague pointed out that using groupby, to take advantage of its ‘hash’ flavor, might be quicker.