[python] Regular expressions

Post by **rxxx** » Tue Mar 18, 2025 11:55

# Get the URL
import re
# Define the regular expression pattern
pattern = re.compile(r'^https?://.*$')

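In the code above, the regular expression ^https?://.*$ does the matching, where:
^ matches the start of the string.
http matches the literal string "http".
s? makes the character s optional.
:// matches the literal string "://".
.* matches any number of characters (newlines excluded by default).
$ matches the end of the string.
Taken together, the expression matches any single-line string that starts with "http://" or "https://".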
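
To make the breakdown above concrete, here is a minimal sketch that runs the same pattern against a few strings. The sample strings are made up for illustration and are not from the original post.

```python
import re

# Same pattern as in the post: the string must start with http:// or https://
pattern = re.compile(r'^https?://.*$')

# Hypothetical sample strings, for illustration only
samples = [
    'http://example.com',
    'https://example.com/path?q=1',
    'ftp://example.com',
    'see https://example.com',
]

# Map each sample to True/False depending on whether the pattern matches
results = {s: bool(pattern.match(s)) for s in samples}
for s, ok in results.items():
    print(f'{s!r}: {ok}')
```

The last sample does not match because ^ anchors the pattern to the start of the string, so a URL that appears mid-sentence is rejected.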

url = re.search(r'\[(.*?)\]', line).group(1)

period = re.search(r'package=\'(.*?)\'', line).group(1)

**When search doesn't work, try findall**
In this case search could not capture the email address, but findall could
email = re.findall(r'email (.*)', cachelines[i+j])  # sample line: email yhwoahg864@gmail.com
print(email[0])
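
For comparison, a small sketch of what the two calls return on the same input. The variable `line` is a stand-in built from the sample in the comment above, since the real `cachelines` data is not shown. One possible reason search seems not to work is that it returns a match object (or None) rather than the text itself, so the captured value has to be read with group(1).

```python
import re

# Hypothetical line, in the format shown in the post's comment
line = 'email yhwoahg864@gmail.com'

# findall returns a list holding the captured group of every match
found = re.findall(r'email (.*)', line)

# search returns a match object, or None when nothing matches;
# the captured text has to be pulled out with group(1)
match = re.search(r'email (.*)', line)
first = match.group(1) if match else None

print(found)
print(first)
```

Checking for None before calling group(1) avoids an AttributeError on lines that do not contain the expected text.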

# For personal use: grabs the parameters that follow the API in the lx music source JS
API_URL_add = re.findall(r'\{API_URL\}(.*?)url/\$\{source\}/', musicDownloader_content)

for old_name in old_name_list:
    new_name = 'platform: "-咪咕-",'
    url_content = re.sub(old_name, new_name, url_content)

article_url = re.findall(r"https://www\.cfmem\.com/[^<>\r\n]+vpn\.html", res.text)[0]

# If the string contains '[' or ']', this is needed
# Original string: 'platform: "[酷我]",'
if '[' in old_name:
    old_name = re.sub(r'\[', r'\\[', old_name)
    old_name = re.sub(r'\]', r'\\]', old_name)
# Only substitute after the brackets have been escaped
file_content = re.sub(old_name, new_name, file_content)
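
As an alternative to escaping the brackets by hand, the standard library's re.escape escapes every regex metacharacter in a string in one call. The sketch below is not the original author's method; `file_content` here is a made-up sample, while `old_name` and `new_name` reuse the values shown in the post.

```python
import re

# Values taken from the post; file_content is a hypothetical sample
old_name = 'platform: "[酷我]",'
new_name = 'platform: "-咪咕-",'
file_content = 'name: "x",\nplatform: "[酷我]",\n'

# re.escape backslash-escapes regex metacharacters, including [ and ],
# so old_name is matched as literal text rather than as a character class
escaped = re.escape(old_name)
file_content = re.sub(escaped, new_name, file_content)

print(file_content)
```

When the text to replace is purely literal, as it is here, file_content.replace(old_name, new_name) would also do the job without any regex.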