{ "cells": [ { "cell_type": "markdown", "id": "9cd25ae8-6076-41b4-b349-99850ac8204a", "metadata": {}, "source": [ "# 1. Pandas Data Basics" ] }, { "cell_type": "markdown", "id": "a807c65a-87b7-4f3f-b4e9-3cd40729bcd5", "metadata": {}, "source": [ "- Pandas focuses on data analysis in a business context:\n", "  - Fields (columns) and rows\n", "  - Statistics\n", "  - Grouping\n", "  - Operations defined over columns and rows\n", "  - Storing values of multiple user-defined types" ] }, { "cell_type": "markdown", "id": "12f1dbf1-6c50-443a-ad2e-fed7cb5957ec", "metadata": {}, "source": [ "- Three data structures\n", "  - Series: one-dimensional\n", "  - `DataFrame: two-dimensional data`\n", "  - Panel: three-dimensional data (deprecated in pandas 0.20 and removed in 0.25)" ] }, { "cell_type": "code", "execution_count": 3, "id": "77ed3899-9333-4458-b611-6e43258196cc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>列1</th><th>列2</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>1</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>3</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>4</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>5</td></tr>\n",
"    <tr><th>5</th><td>6</td><td>6</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " 列1 列2\n", "0 1 1\n", "1 2 2\n", "2 3 3\n", "3 4 4\n", "4 5 5\n", "5 6 6" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "# Alternative input: a list of lists, where each inner list becomes one row\n", "# arr = [\n", "#     [1, 2, 3, 4, 5, 6],\n", "#     [1, 2, 3, 4, 5, 6]\n", "# ]\n", "# Dict input: each key becomes a column label (\"列1\"/\"列2\" mean \"column 1\"/\"column 2\")\n", "arr = {\n", "    \"列1\": [1, 2, 3, 4, 5, 6],\n", "    \"列2\": [1, 2, 3, 4, 5, 6]\n", "}\n", "pd_data = pd.DataFrame(data=arr)\n", "pd_data" ] }, { "cell_type": "markdown", "id": "5db2700f-e34a-4d24-9ff6-8ec6fc0ada45", "metadata": {}, "source": [ "# Loading data" ] }, { "cell_type": "code", "execution_count": 5, "id": "12871386-0042-43af-bd5f-8065699080da", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>经度</th><th>维度</th><th>保留</th><th>海拔</th><th>天数</th><th>日期</th><th>时间</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>39.999840</td><td>116.325001</td><td>0</td><td>487</td><td>39999.120220</td><td>2009-07-05</td><td>02:53:07</td></tr>\n",
"    <tr><th>1</th><td>39.999899</td><td>116.324809</td><td>0</td><td>477</td><td>39999.120278</td><td>2009-07-05</td><td>02:53:12</td></tr>\n",
"    <tr><th>2</th><td>40.000017</td><td>116.324672</td><td>0</td><td>468</td><td>39999.120336</td><td>2009-07-05</td><td>02:53:17</td></tr>\n",
"    <tr><th>3</th><td>40.000234</td><td>116.324729</td><td>0</td><td>460</td><td>39999.120394</td><td>2009-07-05</td><td>02:53:22</td></tr>\n",
"    <tr><th>4</th><td>40.000363</td><td>116.324670</td><td>0</td><td>450</td><td>39999.120451</td><td>2009-07-05</td><td>02:53:27</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>2307</th><td>40.000403</td><td>116.327255</td><td>0</td><td>149</td><td>39999.322859</td><td>2009-07-05</td><td>07:44:55</td></tr>\n",
"    <tr><th>2308</th><td>40.000433</td><td>116.327209</td><td>0</td><td>150</td><td>39999.322917</td><td>2009-07-05</td><td>07:45:00</td></tr>\n",
"    <tr><th>2309</th><td>40.000443</td><td>116.327186</td><td>0</td><td>150</td><td>39999.322975</td><td>2009-07-05</td><td>07:45:05</td></tr>\n",
"    <tr><th>2310</th><td>40.000522</td><td>116.327132</td><td>0</td><td>149</td><td>39999.323032</td><td>2009-07-05</td><td>07:45:10</td></tr>\n",
"    <tr><th>2311</th><td>40.000543</td><td>116.327148</td><td>0</td><td>150</td><td>39999.323090</td><td>2009-07-05</td><td>07:45:15</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>2312 rows × 7 columns</p>\n",
"</div>" ], "text/plain": [ " 经度 维度 保留 海拔 天数 日期 时间\n", "0 39.999840 116.325001 0 487 39999.120220 2009-07-05 02:53:07\n", "1 39.999899 116.324809 0 477 39999.120278 2009-07-05 02:53:12\n", "2 40.000017 116.324672 0 468 39999.120336 2009-07-05 02:53:17\n", "3 40.000234 116.324729 0 460 39999.120394 2009-07-05 02:53:22\n", "4 40.000363 116.324670 0 450 39999.120451 2009-07-05 02:53:27\n", "... ... ... .. ... ... ... ...\n", "2307 40.000403 116.327255 0 149 39999.322859 2009-07-05 07:44:55\n", "2308 40.000433 116.327209 0 150 39999.322917 2009-07-05 07:45:00\n", "2309 40.000443 116.327186 0 150 39999.322975 2009-07-05 07:45:05\n", "2310 40.000522 116.327132 0 149 39999.323032 2009-07-05 07:45:10\n", "2311 40.000543 116.327148 0 150 39999.323090 2009-07-05 07:45:15\n", "\n", "[2312 rows x 7 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Skip the 6-line file header; the file has no column-name row, so labels are supplied via names\n", "# Labels: 经度=longitude, 维度=latitude (normally written 纬度; values near 40 vs. 116 suggest these two labels are swapped), 保留=reserved, 海拔=altitude, 天数=days, 日期=date, 时间=time\n", "data = pd.read_csv(\"dataset/tra.plt\", skiprows=6, header=None, names=[\"经度\", \"维度\", \"保留\", \"海拔\", \"天数\", \"日期\", \"时间\"])\n", "data" ] }, { "cell_type": "markdown", "id": "825d873c-ea4d-478d-9ebd-954edbbbfeba", "metadata": {}, "source": [ "# Data visualization" ] }, { "cell_type": "code", "execution_count": 15, "id": "112d77d5-326a-4157-8019-a1fb43c10167", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 39.99984 , 116.325001],\n", "       [ 39.999899, 116.324809],\n", "       [ 40.000017, 116.324672],\n", "       ...,\n", "       [ 40.000443, 116.327186],\n", "       [ 40.000522, 116.327132],\n", "       [ 40.000543, 116.327148]])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Extract the two coordinate columns as a NumPy array\n", "data[[\"经度\", \"维度\"]].values" ] }, { "cell_type": "code", "execution_count": 14, "id": "f7b1b2d5-bbed-4c7e-9142-dc1e07878fce", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2312, 2)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# (rows, columns) of the two-column selection\n", "data[[\"经度\", \"维度\"]].shape" ] }, { "cell_type": "markdown", "id": "cb73bdd9-46c6-452d-b53b-ad0b17e90ac2", "metadata": {}, "source": [ "# Data analysis" ] }, { "cell_type": "code", "execution_count": 18, "id": "6f195731-0957-44fd-99c7-5d5dfffab306", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>经度</th><th>维度</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>39.999840</td><td>116.325001</td></tr>\n",
"    <tr><th>5</th><td>40.000565</td><td>116.324806</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " 经度 维度\n", "0 39.999840 116.325001\n", "5 40.000565 116.324806" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Label-based selection: rows labelled 0 and 5, and the two named columns\n", "data.loc[[0, 5], [\"经度\", \"维度\"]]" ] }, { "cell_type": "code", "execution_count": 19, "id": "c6f51cfd-b901-496f-b4da-1984f43f4914", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
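<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>经度</th><th>维度</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>39.999840</td><td>116.325001</td></tr>\n",
"    <tr><th>1</th><td>39.999899</td><td>116.324809</td></tr>\n",
"    <tr><th>2</th><td>40.000017</td><td>116.324672</td></tr>\n",
"    <tr><th>3</th><td>40.000234</td><td>116.324729</td></tr>\n",
"    <tr><th>4</th><td>40.000363</td><td>116.324670</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " 经度 维度\n", "0 39.999840 116.325001\n", "1 39.999899 116.324809\n", "2 40.000017 116.324672\n", "3 40.000234 116.324729\n", "4 40.000363 116.324670" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Position-based selection: the first five rows (positions 0-4) and the first two columns\n", "data.iloc[0:5, 0:2]" ] }, { "cell_type": "markdown", "id": "0a6f17f8-8dff-4def-bb32-79c2600e8e66", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 5 }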