Co-location Spatial Data Mining Analysis of POI Data

August 8, 2023, 4:22 PM • Python • 30 reads

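This is the code I used in my thesis. It is roughly split into two parts: deriving the co-location pattern relationships, and outputting the co-location pattern results.

For reference only.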
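
To make the two parts concrete, here is a minimal sketch, not the thesis code. It assumes that each center POI yields one "transaction" (the set of industry categories found in its neighborhood) and that a rule of the form {center} -> {category} is scored by support and confidence. The function name `rule_metrics` and the sample transactions are hypothetical. The attached script imports `apriori` from `efficient_apriori`, presumably for this step, whereas this sketch counts by hand with the standard library only.

```python
from collections import Counter


def rule_metrics(transactions, center):
    """Score rules {center} -> {other} by support and confidence.

    transactions: list of sets of category labels, one set per neighborhood.
    Returns a dict {other: (support, confidence)}.
    """
    n = len(transactions)
    # Number of transactions that contain the center category at all.
    center_count = sum(1 for t in transactions if center in t)
    # Number of transactions that contain both the center and another category.
    pair_counts = Counter()
    for t in transactions:
        if center in t:
            for item in t - {center}:
                pair_counts[item] += 1
    metrics = {}
    for item, cnt in pair_counts.items():
        support = cnt / n                # share of all transactions with both
        confidence = cnt / center_count  # share of center transactions with item
        metrics[item] = (support, confidence)
    return metrics


# Hypothetical neighborhoods, made up for illustration only.
transactions = [
    {"hospital", "pharmacy", "restaurant"},
    {"hospital", "pharmacy"},
    {"hospital", "restaurant"},
    {"school", "restaurant"},
]
metrics = rule_metrics(transactions, "hospital")
```

Rules whose support or confidence falls below the thresholds read from the settings file (`supp_threshold`, `conf_threshold`) would then be discarded.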

Attachment: co_location_minner.py

import os
import time
from functools import partial
from multiprocessing import Pool as ThreadPool2

import geohash
import pandas as pd
from efficient_apriori import apriori
from geopy.distance import great_circle

class ColocationMining(object):
    def __init__(self):
        self.co_location_data_lst = []
        self.read_settings()
        self.init_data()

    def init_data(self):
        self.process_cnt = 0
        self.read_files()
        self.input_df_etl()
        # Map each industry category to an integer code, and back again.
        self.industry_map_dict = self.get_industry_map_dict()
        self.center_industry_code = self.industry_map_dict[self.center_industry]
        self.industry_map_dict_reverse = {v: k for k, v in self.industry_map_dict.items()}
        self.root_path = os.getcwd()

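    def read_settings(self):
        try:
            # settings.xlsx holds one parameter per row:
            # the name in column '参数名称', the value in column '参数值'.
            df_settings = pd.read_excel('./settings.xlsx')
            df_settings = df_settings.set_index('参数名称')
            self.min_distance = df_settings.loc['距离阈值', '参数值']  # distance threshold
            self.center_industry = df_settings.loc['中心poi行业大类', '参数值']  # industry category of the center POIs
            self.conf_threshold = df_settings.loc['最小置信度阈值', '参数值']  # minimum confidence threshold
            self.supp_threshold = df_settings.loc['最小支持度阈值', '参数值']  # minimum support threshold
            self.filepath = df_settings.loc['指定文件路径', '参数值']  # path of the input file
        except Exception as e:
            print('Failed to read the settings file! Details: %s' % e)
            time.sleep(10000)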

    def read_files(self):
        try:
            # CSV files are read as UTF-8; anything else is treated as Excel.
            if str(self.filepath).endswith('.csv'):
                self.df = pd.read_csv(str(self.filepath), encoding='utf8')
            else:
                self.df = pd.read_excel(str(self.filepath))
        except Exception as e:
            self.df = pd.DataFrame()
            print('Failed to read the specified file! Details: %s' % e)
            time.sleep(10000)

    def input_df_etl(self):
        df = self.df
        # Build a 'lng|lat' string for each POI and derive its geohash.
        df['wgs84_lng'] = df['wgs84_lng'].astype(str)
        df['wgs84_lat'] = df['wgs84_lat'].astype(str)
        df['location'] = df['wgs84_lng'] + '|' + df['wgs84_lat']
        df['geohash'] = df['location'].apply(self.location2geohash)
        # Drop rows that have no industry category ('行业大类').
        df['行业大类'] = df['行业大类'].fillna('')
        df = df[df['行业大类'] != '']
        # Split into center POIs and all remaining POIs.
        df_select = df[df['行业大类'] == self.center_industry]
        self.df_select = df_select.reset_index(drop=True)
        self.df_no_select = df[df['行业大类'] != self.center_industry]

    def location2geohash(self, location):
        lng, lat = [float(x) for x in location.split('|')]
        _geohash = geohash.encode(latitude=lat, longitude=lng, precision=12)
        return _geohash

    def get_industry_map_dict(self):
        # Assign a consecutive integer code to each distinct industry category.
        industry_set = set(self.df['行业大类'])
        self.industry_map_dict = {}
        i = 0
        for industry in industry_set:
            self.industry_map_dict[industry] = i
            i += 1
        return self.industry_map_dict

    def caculate_coords_distance(self, location1, location2):
        location1 = [float(x) for x in location1.split("|")]
        location2 = [float(x) for x in location2.split("|")]
        # great_circle expects (lat, lng); the strings are 'lng|lat', so reverse them.
        location2.reverse()
        location1.reverse()
        d = great_circle(location1, location2).meters
        return d

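    def get_nearyby_pois(self, _geohash, location):
        df_no_select = self.df_no_select
        # Coarse filter: keep POIs whose geohash shares the first 6 characters.
        # startswith is a true prefix match; str.contains would also match
        # the prefix appearing later in the string.
        target_geohash = _geohash[:6]
        # .copy() so that adding the 'distance' column does not write to a slice.
        df_nearby = df_no_select[df_no_select['geohash'].str.startswith(target_geohash)].copy()
        if df_nearby.shape[0] >= 1:
            df_nearby['distance'] = df_nearby['location'].apply(partial(self.caculate_coords_distance, location))
            df_nearby = df_nearby[df_nearby['distance']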
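
The neighbor search in `get_nearyby_pois` pairs a coarse geohash-prefix filter with an exact great-circle distance check. The sketch below illustrates only the distance check, using the standard library: a haversine formula stands in for `geopy.distance.great_circle`, and `haversine_m`, `within_distance`, the coordinates, and the 500 m threshold are all hypothetical. The `<= max_meters` comparison is an assumption, since the original line is cut off before the comparison.

```python
import math

EARTH_RADIUS_M = 6371009.0  # mean Earth radius in meters (spherical model)


def haversine_m(location1, location2):
    """Great-circle distance in meters between two 'lng|lat' strings."""
    lng1, lat1 = (float(x) for x in location1.split("|"))
    lng2, lat2 = (float(x) for x in location2.split("|"))
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_distance(center, candidates, max_meters):
    """Keep the candidates that lie within max_meters of the center (assumed rule)."""
    return [c for c in candidates if haversine_m(center, c) <= max_meters]


# Made-up coordinates in 'lng|lat' form, matching the 'location' column format.
center = "116.3975|39.9087"
candidates = ["116.3980|39.9090", "116.4500|39.9500"]
nearby = within_distance(center, candidates, 500)
```

One design point to keep in mind: a fixed 6-character prefix only finds POIs in the same geohash cell, so a POI that is close to the center but sits just across a cell boundary would be skipped by the coarse filter before the distance check ever runs.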

Original: https://blog.csdn.net/zccccccc1998/article/details/124070857
Author: 宗成1998
Title: Co-location Spatial Data Mining Analysis of POI Data

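Original articles are protected by copyright. When reposting, please cite the source: https://www.johngo689.com/742664/

Reposted articles are protected by the original author's copyright. When reposting, please credit the original author and source!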
