🌳 Long time no see
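Meeting is fate and every visitor is a guest. Sure you don't want to read to the end before leaving? 😁
A few days apart feels like, well, a few days! I've been busy with coursework lately, so the 小玩意儿 (Little Gadgets) column went quiet for a while. Sorry 😬
I'm not sure why, but I keep wanting to update the Little Gadgets column. Maybe it's because many readers like it, or maybe because writing this kind of post is relaxing. Either way, I just enjoy it...
Writing the post is the easy part, though. Writing the code is another story: just look at my hair, it clearly hasn't had an easy time...
A single bug can take two hours of my life. Worst of all is testing the code, a process that feels like... never mind, it simply takes my life!
Still! After countless beatings from bugs, this barely hurts anymore (just showing off a little).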
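🍄 OK! Back to the topic
Today I'm bringing you a migration ("moving house") script. Moving house means, well, moving house...
About a week ago I stumbled on the "field" of article migration. I didn't know what "moving house" meant back then (forgive my inexperience). Later I learned that it means moving articles from one community to another. Got it!
Then it hit me: I had already written a script that transfers CSDN articles into Evernote (印象笔记), so could I build on it to make an automatic migration script that republishes CSDN articles on another community? I looked at the other communities and picked one at random, counting one sheep, two sheep, three sheep... and landed on 51CTO (don't ask why, I don't know either).
Let's get to it. Move!!!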
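🍂 The script, from start to finish
This builds on an earlier post: CSDN文章自动转移到印象笔记?一怒之下的我"揍"出了代码~ (on automatically transferring CSDN articles to Evernote)
The design this time is much the same:
(1) The user enters how many articles to migrate, and the script collects the links and titles of the target articles;
(2) It loops over the articles, opens each one, and migrates it;
(3) It saves the titles of successfully migrated articles to an Excel file. On later runs it reads the already-migrated titles and selects only the articles not yet migrated, which avoids migrating anything twice.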
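The de-duplication in step (3) can be sketched on its own, without any browser automation. This is a minimal illustration rather than the script's actual code: the function name `select_pending`, its parameters, and the example URLs are all made up for this sketch.

```python
from typing import Iterable, List, Tuple


def select_pending(
    titles: List[str],
    urls: List[str],
    moved_titles: Iterable[str],
    limit: int,
) -> List[Tuple[str, str]]:
    """Return up to `limit` (title, url) pairs whose title is not in `moved_titles`."""
    moved = set(moved_titles)  # a set makes each membership check cheap
    pending = []
    for title, url in zip(titles, urls):
        if len(pending) >= limit:
            break
        if title not in moved:
            pending.append((title, url))
    return pending


# Hypothetical data: "post A" was migrated on an earlier run.
titles = ["post A", "post B", "post C"]
urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
print(select_pending(titles, urls, moved_titles=["post A"], limit=1))
# prints [('post B', 'https://example.com/b')]
```

Checking the limit before appending keeps the result at exactly `limit` items at most, so the record of migrated titles never contains an article that was not actually processed.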
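Read on for how each step is implemented 👇
The Little Gadgets posts seem to be under the spell of Get_cookie.py. Yes, it's back again: it is the foundation of the automatic login. See the post mentioned above for details.
Without further ado, here's the code: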
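from selenium import webdriver
from time import sleep
import json

if __name__ == '__main__':
    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get('https://passport.csdn.net/login?code=public')
    # Wait 10 seconds; the login presumably has to be completed by hand in this window
    sleep(10)
    # Read the session cookies and save them as JSON
    dictCookies = driver.get_cookies()
    jsonCookies = json.dumps(dictCookies)
    # Write with the same encoding the migration script uses to read the file
    with open('csdn_cookies.txt', 'w', encoding='utf8') as f:
        f.write(jsonCookies)
    print('Cookies saved!')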
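To reuse the saved file for automatic login, the cookies have to be read back and reduced to fields that Selenium's `add_cookie` accepts. The following is a minimal sketch, assuming the file holds the JSON list written by the script above. `load_cookies` is a hypothetical helper, and the `driver.add_cookie` calls appear only as comments because they need a live browser.

```python
import json
from typing import Dict, List


def load_cookies(json_text: str, domain: str = ".csdn.net") -> List[Dict[str, object]]:
    """Turn the JSON saved by Get_cookie.py into minimal cookie dicts."""
    cookies = []
    for cookie in json.loads(json_text):
        cookies.append({
            "name": cookie["name"],
            "value": cookie["value"],
            "domain": domain,  # assumption: one shared domain for every cookie
            "path": "/",
        })
    return cookies


# Hypothetical sample in the shape driver.get_cookies() returns.
sample = json.dumps([{"name": "token", "value": "abc", "expiry": 1700000000}])
print(load_cookies(sample))
# prints [{'name': 'token', 'value': 'abc', 'domain': '.csdn.net', 'path': '/'}]

# With a live browser (not run here), after driver.get(...) on the same domain:
# with open("csdn_cookies.txt", encoding="utf8") as f:
#     for c in load_cookies(f.read()):
#         driver.add_cookie(c)
```

Selenium only accepts cookies for the domain of the page currently open, which is why the page has to be loaded before the cookies are added and refreshed afterwards.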
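🌿 Time to actually move~
This script migrates Markdown articles only. Articles that weren't written in the Markdown editor can't be moved.
🌱 Migration tools
- selenium: drives the browser automatically to open each target article link;
- time: paces the script and waits for pages to load;
- pyautogui: simulates keyboard and mouse input. It reads the saved key screenshots, checks whether each one appears on screen, and then performs the corresponding keyboard or mouse action;
- openpyxl: saves the titles of migrated articles so later runs can select the ones not yet migrated;
- os: checks whether the file exists; if it doesn't, it is created;
- win32api and win32con: used together to scroll the mouse wheel automatically;
- json: reads the cookies saved by Get_cookie.py and converts their format so selenium can pass them to the browser for automatic login.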
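The "find the screenshot on screen, then act" pattern that pyautogui is used for can be wrapped in a small polling helper with a timeout, so a missing image does not hang the script forever. This is a sketch under assumptions: `wait_for_image` and its parameters are hypothetical names, and the locator is injectable so the demo below runs with a fake instead of a real screen.

```python
import time
from typing import Callable, Optional


def wait_for_image(
    path: str,
    locate: Optional[Callable[[str], object]] = None,
    timeout: float = 30.0,
    interval: float = 0.5,
):
    """Poll `locate(path)` until it returns a match or `timeout` seconds pass."""
    if locate is None:
        import pyautogui  # imported lazily so the sketch loads without a display
        locate = pyautogui.locateOnScreen
    deadline = time.monotonic() + timeout
    while True:
        try:
            box = locate(path)
        except Exception:
            # Assumption: depending on the PyAutoGUI version, a failed lookup
            # may raise instead of returning None; both count as "not found yet".
            box = None
        if box is not None:
            return box
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{path} did not appear within {timeout} s")
        time.sleep(interval)


# Demo with a fake locator that succeeds on the third call.
calls = []


def fake_locate(path):
    calls.append(path)
    return (10, 20, 30, 40) if len(calls) >= 3 else None


print(wait_for_image("./photo_51/1.png", locate=fake_locate, interval=0.01))
# prints (10, 20, 30, 40)
```

Catching every exception is a deliberate simplification here; it also hides problems such as a missing image file, so a real script might narrow it to the specific not-found exception of the installed PyAutoGUI version.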
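🌼 Migration process
Preparation
- Before migrating, capture and save the following screenshots. These were taken for my own setup, so adjust them to yours (the point is the approach; feel free to improvise)
Start migrating
- Step 1: Run the program, enter the number of articles to migrate, and press Enter
- Step 2: Wait for the page to load, then manually choose a column and click Search
- Step 3: Manually open a second browser window, go to the 51CTO blog, open the new-article page and stay on it. Then arrange the two windows: the 51CTO window must show your avatar, and the CSDN window must be shrunk horizontally to its minimum width
- Step 4: Enter 1 in the program and press Enter, then arrange the screen as in the screenshot above
- Demo
- Saved record of migrated articles
🌾 Migration source code
This is meant to show the approach; adapt it freely~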
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import pyautogui
import openpyxl
import os
import win32api
import win32con
import json


class CSDN(object):
    def __init__(self, artical_num):
        self.driver = webdriver.Chrome()
        self.artical_num = artical_num  # number of articles to migrate
        self.page_urls = []             # URLs of all listed articles
        self.all_titles = []            # titles of all listed articles
        self.To_transfer_url = []       # URLs selected for this run
        self.To_transfer_title = []     # titles selected for this run
        self.max_row = 0                # last used row in the Excel sheet

    def login(self):
        # Open the site first: cookies can only be added for the current domain
        self.driver.get('https://mp.csdn.net/mp_blog/manage/article?')
        with open('csdn_cookies.txt', 'r', encoding='utf8') as f:
            listCookies = json.loads(f.read())
        for cookie in listCookies:
            cookie_dict = {
                'domain': '.csdn.net',
                'name': cookie.get('name'),
                'value': cookie.get('value'),
                "expires": '',
                'path': '/',
                'httpOnly': False,
                'HostOnly': False,
                'Secure': False
            }
            self.driver.add_cookie(cookie_dict)
        # Reload so the page picks up the logged-in session
        self.driver.refresh()

    def open_excel(self):
        # Load the record of already migrated titles (column A)
        self.f = openpyxl.load_workbook('./已转移文章.xlsx')
        self.sheet = self.f['Trans_artical']
        self.Transferred_titles = [i.value for i in self.sheet['A']]
        self.max_row = self.sheet.max_row

    def save_excel(self):
        # Append this run's titles below the existing rows
        for i in range(len(self.To_transfer_title)):
            self.sheet.cell(self.max_row+i+1, 1).value = self.To_transfer_title[i]
        self.f.save('./已转移文章.xlsx')
        print('File saved!')
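
    def Filter_articles(self):
        # Select up to artical_num articles that are not in the Excel record yet
        n = 0
        for i in range(len(self.all_titles)):
            # Check the limit before appending so no extra article is selected
            if n == self.artical_num:
                break
            if self.all_titles[i] not in self.Transferred_titles:
                self.To_transfer_url.append(self.page_urls[i])
                self.To_transfer_title.append(self.all_titles[i])
                n += 1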

    def parse_page(self):
        """
        After the user has chosen a column and clicked Search, entering 1 lets the program continue
        :return:
        """
        WebDriverWait(self.driver, 1000).until(
            EC.presence_of_element_located((By.XPATH, '//ul[@role="menu"]/li/a[text()="内容管理"]'))
        )
        self.driver.find_element(By.XPATH, '//ul[@role="menu"]/li/a[text()="内容管理"]').click()
        print('\nPlease confirm that you have chosen a column and clicked Search......')
        user1 = input('Once everything is correct, enter 1 to continue......:')
        WebDriverWait(self.driver, 1000).until(
            EC.presence_of_element_located((By.XPATH, '//p[@class="article-list-item-txt"]/a'))
        )
        time.sleep(2)
        # Collect links and titles page by page until the next-page click fails
        while True:
            try:
                article = self.driver.find_elements(By.XPATH, '//p[@class="article-list-item-txt"]/a')
                self.page_urls += [ele.get_attribute('href') for ele in article]
                self.all_titles += [i.text for i in article]
                self.driver.find_element(By.XPATH, '//*[@id="view-containe"]/div/div/div[4]/div/button[2]').click()
                time.sleep(3)
            except Exception:
                break
        # Reverse both lists so the entries collected last come first
        self.page_urls = self.page_urls[::-1]
        self.all_titles = self.all_titles[::-1]
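
    def get_urls(self):
        # Iterate over what was actually selected, which may be fewer than artical_num
        for i in range(len(self.To_transfer_url)):
            self.driver.get(self.To_transfer_url[i])
            time.sleep(3)
            try:
                # If this button exists and is enabled, the article is skipped
                # (presumably it was not written in the Markdown editor)
                button = self.driver.find_element(By.XPATH, '//div[@id="moreDiv"]/div[10]/div/div/div[2]/button')
                if button.is_enabled():
                    continue
            except Exception:
                self.move_to_51()
                print(f'{self.To_transfer_title[i]} migrated successfully!')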

    def move_to_51(self):
        # Each block below waits until a saved screenshot appears on screen,
        # then clicks its center and sends the needed keystrokes.
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/1.png')
        x, y = pyautogui.center(r)
        pyautogui.doubleClick(x, y)
        pyautogui.leftClick(x, y)
        pyautogui.hotkey('ctrl', 'c')
        time.sleep(1)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/2.png')
        x, y = pyautogui.center(r)
        pyautogui.doubleClick(x, y)
        pyautogui.hotkey('ctrl', 'v')
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/3.png')
        x, y = pyautogui.center(r)
        pyautogui.doubleClick(x, y)
        pyautogui.hotkey('ctrl', 'a')
        pyautogui.hotkey('ctrl', 'c')
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/4.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        pyautogui.hotkey('ctrl', 'v')
        time.sleep(1)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/5.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        # Optional element: try a few times, click it if it shows up
        for i in range(4):
            r = pyautogui.locateOnScreen('./photo_51/19.png')
            if r is not None:
                x, y = pyautogui.center(r)
                pyautogui.leftClick(x, y)
                break
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/6.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/7.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        # Scroll the page down with repeated mouse-wheel events
        for i in range(1, 800):
            win32api.mouse_event(win32con.MOUSEEVENTF_WHEEL, 0, 0, -1)
        time.sleep(1)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/9.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/10.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/11.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/12.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/17.png')
        x, y = pyautogui.center(r)
        pyautogui.leftClick(x, y)
        pyautogui.press('enter')
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/14.png')
        x, y = pyautogui.center(r)
        pyautogui.doubleClick(x, y)
        # Keep double-clicking at the current position while 18.png is visible
        try:
            r = None
            for i in range(5):
                r = pyautogui.locateOnScreen('./photo_51/18.png')
                if r is not None:
                    pyautogui.doubleClick()
                else:
                    break
        except Exception:
            pass
        pyautogui.leftClick()
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/15.png')
        x, y = pyautogui.center(r)
        pyautogui.doubleClick(x, y)
        r = None
        while r is None:
            r = pyautogui.locateOnScreen('./photo_51/16.png')
        x, y = pyautogui.center(r)
        pyautogui.doubleClick(x, y)

    def run(self):
        self.login()
        self.parse_page()
        if os.path.exists('./已转移文章.xlsx'):
            # A record exists: skip the articles that were already migrated
            self.open_excel()
            self.Filter_articles()
        else:
            # First run: take the first artical_num articles and create the record
            self.To_transfer_url = self.page_urls[:self.artical_num]
            self.To_transfer_title = self.all_titles[:self.artical_num]
            self.f = openpyxl.Workbook()
            self.sheet = self.f.create_sheet('Trans_artical')
        self.get_urls()
        self.save_excel()
        self.driver.quit()


if __name__ == '__main__':
    print('------ Migration starting! Please get ready! ------')
    artical_num = int(input('\nEnter the number of articles to migrate: '))
    time.sleep(2)
    csdn = CSDN(artical_num)
    csdn.run()
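Most of `move_to_51` repeats one pattern: wait for a screenshot, find its center, perform an action. One possible way to shorten it is a data-driven step table. The sketch below is only an illustration of that idea, not the author's code: `STEPS`, `run_steps`, `find` and `actions` are hypothetical names, and only the four steps that are plain left-clicks in the source (9.png to 12.png) are listed.

```python
from typing import Callable, Dict, List, Tuple

# Image files 9-12 are each "wait, then left-click" in the script above.
STEPS: List[Tuple[str, str]] = [
    ("./photo_51/9.png", "click"),
    ("./photo_51/10.png", "click"),
    ("./photo_51/11.png", "click"),
    ("./photo_51/12.png", "click"),
]


def run_steps(
    steps: List[Tuple[str, str]],
    find: Callable[[str], Tuple[int, int]],
    actions: Dict[str, Callable[[int, int], None]],
) -> List[str]:
    """Locate each image via `find`, apply the named action, return the images handled."""
    done = []
    for image, action in steps:
        x, y = find(image)      # expected to block until the image is on screen
        actions[action](x, y)
        done.append(image)
    return done


# Dry run with stand-ins instead of a real screen and mouse.
clicked = []
result = run_steps(
    STEPS,
    find=lambda image: (100, 200),
    actions={"click": lambda x, y: clicked.append((x, y))},
)
print(len(result), len(clicked))
# prints 4 4
```

In a real run, `find` could wrap `pyautogui.locateOnScreen` plus `pyautogui.center`, and `actions` could map `"click"` to `pyautogui.leftClick` and another key to `pyautogui.doubleClick`. Steps that also send hotkeys would need a richer step format than this two-field tuple.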
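⏰ Closing words
Thank you all for your continued support 💪 That's it for today's share! 🚀
If you enjoyed this post, how about the triple combo~ Like 👍 Save 🌈 Follow 💖 Your support is my biggest motivation to keep writing! Thank you 🌹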
“🚄See you next time💨”
peace~
Original: https://blog.csdn.net/Oh_Python/article/details/127713518
Author: IT工藤新一
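Title: 古有愚公移山,今有冤种搬家~某人含泪写完了搬家脚本~~ (In ancient times Yu Gong moved mountains; today a poor sap moves blog posts ~ someone finished the migration script in tears ~~)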