{"id":429,"date":"2023-04-17T02:11:21","date_gmt":"2023-04-16T21:11:21","guid":{"rendered":"https:\/\/immadshahid.com\/?p=429"},"modified":"2023-04-17T02:11:21","modified_gmt":"2023-04-16T21:11:21","slug":"pandas-cheat-sheet","status":"publish","type":"post","link":"https:\/\/immadshahid.com\/blog\/pandas-cheat-sheet\/","title":{"rendered":"Pandas Cheat Sheet"},"content":{"rendered":"\n<p>Here are the functions in the pandas&#8217; module or library:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\n<strong>DATA FRAME\n<\/strong>data = {'Country': &#91;'Belgium',  'India',  'Brazil'],\n\n'Capital': &#91;'Brussels',  'New Delhi',  'Brasilia'],\n\n'Population': &#91;11190846, 1303171035, 207847528]} \n\ndf = pd.DataFrame(data,columns=&#91;'Country',  'Capital',  'Population'])\n\n<strong>READ AND WRITE CSV<\/strong>\n\npd.read_csv('file.csv', header=None, nrows=5)\ndf.to_csv('myDataFrame.csv')\n\n<strong>Read multiple sheets from the same file<\/strong>\n\nxlsx = pd.ExcelFile('file.xls')\ndf = pd.read_excel(xlsx,  'Sheet1')\n\n<strong>Read and Write to Excel<\/strong>\n\npd.read_excel('file.xlsx')\ndf.to_excel('dir\/myDataFrame.xlsx',  sheet_name='Sheet1')\n\n<strong>Read and Write to SQL Query or Database Table\n<\/strong>(read_sql()is a convenience wrapper around read_sql_table() and read_sql_query())\n\nfrom sqlalchemy import create_engine\nengine = create_engine('sqlite:\/\/\/:memory:')\npd.read_sql(SELECT * FROM my_table;, engine)\npd.read_sql_table('my_table', engine)\npd.read_sql_query(SELECT * FROM my_table;', engine)\ndf.to_sql('myDf', engine)\n\n\n<strong>Selection\n<\/strong><em>Getting\nGet one element<\/em>\n\ns&#91;'b']\n-5\n\n<strong>Get subset of a DataFrame\n<\/strong>\ndf&#91;1:]\nCountry     Capital   Population\n1  India    New Delhi 1303171035\n2  Brazil   Brasilia  207847528\n\n\n<strong>Selecting', Boolean Indexing and Setting\n<\/strong><em>By Position\nSelect single value by row and and column<\/em>\n\ndf.iloc(&#91;0], &#91;0])\n'Belgium'\ndf.iat(&#91;0], &#91;0])\n'Belgium'\n\n<strong>By Label\n<\/strong><em>Select single value by row and column labels\n<\/em>\ndf.loc(&#91;0],  &#91;'Country'])\n'Belgium'\ndf.at(&#91;0],  &#91;'Country'])\n'Belgium'\n\n<strong>By Label\/Position\n<\/strong><em>Select single row of subset of rows\n<\/em>\ndf.ix&#91;2]\nCountry      Brazil\nCapital    Brasilia\nPopulation  207847528\n\n<strong>Select a single column of subset of columns\n<\/strong>\ndf.ix&#91;:, 'Capital']\n0     Brussels\n1    New Delhi\n2     Brasilia\n\n<strong>Select rows and columns\n<\/strong>\ndf.ix&#91;1, 'Capital']\n'New Delhi'\n\n<strong>Boolean Indexing\n<\/strong><em>Series s where value is not >1\n<\/em>\ns&#91;~(s > 1)]\n\n<em>s where value is &lt;-1 or >2\n<\/em>\ns&#91;(s &lt; -1) | (s > 2)]\n\n<strong>Use filter to adjust DataFrame\n<\/strong>\ndf&#91;df&#91;'Population']>1200000000]\nPOWERED BY DATACAMP WORKSPACE\n\n<em>Set index a of Series s to 6\n<\/em>\ns&#91;'a'] = 6\n\n<strong>Dropping\n<\/strong>Drop values from rows (axis=0)\n\ns.drop(&#91;'a',  'c'])\n\n<em>Drop values from columns(axis=1)\n<\/em>\ndf.drop('Country', axis=1) \n\n<strong>Sort and Rank\n<\/strong><em>Sort by labels along an axis\n<\/em>\ndf.sort_index()\n\n<em>Sort by the values along an axis\n<\/em>\ndf.sort_values(by='Country') \n\n<strong>Assign ranks to entries\n<\/strong>\ndf.rank()\n\n<strong>Retrieving Series\/DataFrame Information\n<\/strong><em>Basic Information\n(rows, columns)\n<\/em>\ndf.shape\n\n<strong>Describe index\n<\/strong>\ndf.index\n\n<strong>Describe DataFrame columns\n<\/strong>\ndf.columns\n\n<strong>Info on DataFrame\n<\/strong>\ndf.info()\n\n<strong>Number of non-NA values\n<\/strong>\ndf.count()\n\n<strong>Summary\n<\/strong><em><strong>Sum of values\n<\/strong><\/em>\ndf.sum()\n\n<em><strong>Cumulative sum of values\n<\/strong><\/em>\ndf.cumsum()\n\n<em><strong>Minimum\/maximum values\n<\/strong><\/em>\ndf.min()\/df.max()\n\n<strong><em>Minimum\/Maximum index value\n<\/em><\/strong>\ndf.idxmin()\/df.idxmax() \n\n<strong>Summary statistics\n<\/strong>\ndf.describe()\n\n<em><strong>Mean of values\n<\/strong><\/em>\ndf.mean()\n\n<strong><em>Median of values\n<\/em><\/strong>\ndf.median()\n\n<strong>Applying Functions\n<\/strong>f = lambda x: x*2\n\n<strong><em>Apply function\n<\/em><\/strong>\ndf.apply(f)\n\n<strong><em>Apply function element-wise\n<\/em><\/strong>\ndf.applynap(f) \n\n<strong>Internal Data Alignment\n<\/strong><em>NA values are introduced in the indices that don't overlap:\n<\/em>\ns3 = pd.Series(&#91;7, -2, 3],  index=&#91;'a',  'c',  'd'])\ns + s3\na     10.0\nb     NaN\nc     5.0\nd     7.0\n\n<strong>Arithmetic Operations with Fill Methods\n<\/strong><em>You can also do the internal data alignment yourself with the help of the fill methods:<\/em>\n\ns.add(s3, fill_value=0)\na    10.0\nb    -5.0\nc    5.0\nd    7.0\ns.sub(s3, fill_value=2)\ns.div(s3, fill_value=4)\ns.mul(s3, fill_value=3)\n\n\n<strong>\n<\/strong><a href=\"https:\/\/pandas.pydata.org\/Pandas_Cheat_Sheet.pdf\">https:\/\/pandas.pydata.org\/Pandas_Cheat_Sheet.pdf<\/a>\n\n\n<strong><em>SOURCE: DATACAMP, PANDAS.PYDATA.ORG\r\n<\/em><\/strong>-----------------------------------------------------------------------\nimmadshahid.com<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Here are the functions in the pandas&#8217; module or library:<\/p>\n","protected":false},"author":1,"featured_media":430,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[130,124,128,120,92,54,129,127],"tags":[132,131,126],"class_list":["post-429","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cheat-sheet","category-data-analytics","category-data-science","category-data-sscience","category-earning-and-learning","category-education","category-pandas","category-python","tag-data-science","tag-pandas","tag-python"],"_links":{"self":[{"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/posts\/429","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/comments?post=429"}],"version-history":[{"count":1,"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/posts\/429\/revisions"}],"predecessor-version":[{"id":431,"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/posts\/429\/revisions\/431"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/media\/430"}],"wp:attachment":[{"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/media?parent=429"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/categories?post=429"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/immadshahid.com\/blog\/wp-json\/wp\/v2\/tags?post=429"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}