Essential Python Regular Expression Techniques for Handling Everyday Text-Processing Problems

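Introduction

Regular expressions (regex for short) are a powerful text-processing tool in Python, widely used for searching, replacing, validating, and extracting strings. Mastering them helps developers process text data more efficiently and solve a wide range of text-processing problems.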

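Regular Expression Basics

1. Regular Expression Syntax

A regular expression is made up of ordinary characters and special characters (metacharacters). Some commonly used metacharacters and their meanings: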

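.: matches any single character except a newline.
*: matches the preceding subexpression zero or more times.
+: matches the preceding subexpression one or more times.
?: matches the preceding subexpression zero or one time.
^: matches the start of the input string.
$: matches the end of the input string (or just before a trailing newline).
[...]: matches any one of the characters inside the brackets (a character class).
{n}: matches the preceding subexpression exactly n times.
{n,}: matches the preceding subexpression at least n times.
{n,m}: matches the preceding subexpression at least n times, but no more than m times.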
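
The metacharacters above can be tried out with a handful of small patterns. The following is a minimal sketch; the sample strings and the helper name matches are made up for illustration and are not part of the re module.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# . matches any single character except a newline
assert matches(r"c.t", "cat") and not matches(r"c.t", "c\nt")

# * (zero or more), + (one or more), ? (zero or one)
assert matches(r"ab*", "a") and matches(r"ab*", "abbb")
assert matches(r"ab+", "ab") and not matches(r"ab+", "a")
assert matches(r"colou?r", "color") and matches(r"colou?r", "colour")

# [...] matches one character from the set; {n}, {n,}, {n,m} bound repetition
assert matches(r"[abc]{2}", "ca") and not matches(r"[abc]{2}", "cab")
assert matches(r"\d{2,}", "2024") and not matches(r"\d{2,}", "7")
assert matches(r"\d{2,4}", "123") and not matches(r"\d{2,4}", "12345")

# ^ and $ anchor a search to the start and end of the string
assert re.search(r"^Hello", "Hello world") is not None
assert re.search(r"world$", "Hello world") is not None
assert re.search(r"^world", "Hello world") is None
```

Every assertion passes silently when the script runs; changing a sample string is a quick way to see which rule rejects it.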

2. Python's re Module

In Python, regular expression support is provided mainly by the re module. Some of its commonly used functions:

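re.match(pattern, string): tries to match the pattern at the start of the string only.
re.search(pattern, string): scans the whole string and returns the first match.
re.findall(pattern, string): scans the whole string and returns a list of all non-overlapping matches.
re.sub(pattern, replacement, string): replaces every non-overlapping match in the string and returns the new string.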
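
To make the differences between these four functions concrete, here is a short sketch that runs each of them against the same made-up sample string:

```python
import re

text = "cat bat rat"

# re.match only succeeds if the pattern matches at the very start of the string.
print(re.match(r"bat", text))           # None: "bat" is not at the start
print(re.match(r"cat", text).group())   # cat

# re.search scans forward and returns the first match as a Match object.
first = re.search(r"[br]at", text)
print(first.group(), first.span())      # bat (4, 7)

# re.findall returns every non-overlapping match as a list of strings.
all_words = re.findall(r"\w+at", text)
print(all_words)                        # ['cat', 'bat', 'rat']

# re.sub returns a new string with every match replaced.
replaced = re.sub(r"[cbr]at", "dog", text)
print(replaced)                         # dog dog dog
```

Note that re.match and re.search return None when nothing matches, so real code should check the result before calling .group().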

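Advanced Techniques

1. Groups and Backreferences

Parentheses () in a regular expression create a capturing group. Once a group has matched, \1, \2, and so on refer back to the text captured by the first group, the second group, and so on.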

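import re

text = "The rain in Spain falls mainly in the plain."
# Two capturing groups: the word before " in " and the word after it.
pattern = r"(\w+) in (\w+) falls"

# With more than one group, findall returns a list of tuples, one item per group.
matches = re.findall(pattern, text)
for match in matches:
    print(match)  # ('rain', 'Spain')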
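
The \1 and \2 references mentioned above can appear both inside a pattern and inside a replacement string. A minimal sketch of each use, with made-up sample strings:

```python
import re

# Inside the pattern, \1 must match the same text that group 1 captured,
# so this finds a word that is immediately repeated.
doubled = re.search(r"\b(\w+) \1\b", "This is is a typo.")
print(doubled.group(1))  # is

# In a replacement string, \1 and \2 insert the captured groups,
# here turning "Last, First" into "First Last".
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Lovelace, Ada")
print(swapped)  # Ada Lovelace
```

Raw strings (the r prefix) matter here: without them, Python would interpret \1 as an escape sequence before the re module ever sees it.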

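2. Greedy and Non-Greedy Matching

A greedy quantifier matches as many characters as possible, while a non-greedy quantifier matches as few as possible. Quantifiers are greedy by default; appending ? to a quantifier (for example *?, +?, or ??) makes it non-greedy.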

import re

text = "I have 3 apples and 2 oranges."
# \d+ is greedy by default, so each group captures the full run of digits.
pattern = r"(\d+) apples and (\d+) oranges"

matches = re.findall(pattern, text)
for match in matches:
    print(match)  # ('3', '2')
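
The difference between the two modes shows up most clearly when a pattern can end in more than one place. The following sketch, using a made-up sentence with two quoted words, runs the same pattern with a greedy and a non-greedy quantifier:

```python
import re

text = 'He said "yes" and then "no".'

# Greedy: .* grabs as much as it can, so the match runs
# from the first quote all the way to the last quote.
greedy = re.findall(r'".*"', text)
print(greedy)  # ['"yes" and then "no"']

# Non-greedy: .*? stops at the earliest closing quote
# that still lets the pattern succeed.
lazy = re.findall(r'".*?"', text)
print(lazy)    # ['"yes"', '"no"']
```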

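3. Compiling Regular Expressions

When processing large amounts of data or reusing the same regular expression many times, compiling it once with re.compile() can improve efficiency, since the compiled pattern object is then reused directly.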

import re

pattern = re.compile(r"(\d+) apples and (\d+) oranges")

text = "I have 3 apples and 2 oranges."
matches = pattern.findall(text)
for match in matches:
    print(match)
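
The benefit of compiling shows up when one pattern is applied to many inputs. Below is a minimal sketch of that usage; the list of lines and the name fruit_pattern are invented for illustration.

```python
import re

# Compile once, outside the loop, and reuse the pattern object.
fruit_pattern = re.compile(r"(\d+) apples and (\d+) oranges")

lines = [
    "I have 3 apples and 2 oranges.",
    "She has 10 apples and 7 oranges.",
    "No fruit here.",
]

totals = []
for line in lines:
    m = fruit_pattern.search(line)
    if m:  # search() returns None when a line does not match
        apples, oranges = map(int, m.groups())
        totals.append(apples + oranges)

print(totals)  # [5, 17]
```

A compiled pattern offers the same operations as the module-level functions (search, match, findall, sub, and so on) as methods.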

Practical Examples

The following examples use regular expressions to solve real-world problems:

1. Validating an Email Address

import re

email = "example@example.com"
pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"

if re.match(pattern, email):
    print("Valid email address")
else:
    print("Invalid email address")
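
One subtlety of the check above is that $ also matches just before a trailing newline, so an address followed by "\n" would still be accepted. A possible variation, sketched here with a hypothetical helper named is_valid_email, uses fullmatch() so that the entire string has to match:

```python
import re

# Same character sets as the pattern above, without the ^ and $ anchors,
# because fullmatch() already requires the entire string to match.
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+")

def is_valid_email(address: str) -> bool:
    """Hypothetical helper: a loose syntactic check, not full RFC validation."""
    return EMAIL_PATTERN.fullmatch(address) is not None

print(is_valid_email("example@example.com"))     # True
print(is_valid_email("example@example.com\n"))   # False
print(is_valid_email("no-at-sign.example.com"))  # False
```

Like the original pattern, this only checks the general shape of an address; it does not confirm that the mailbox exists.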

2. Extracting Phone Numbers

import re

text = "My phone number is 123-456-7890."
pattern = r"\d{3}-\d{3}-\d{4}"

phone_numbers = re.findall(pattern, text)
for number in phone_numbers:
    print(number)
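
When individual parts of each number are needed as well as the full match, capturing groups can be combined with re.finditer. This is a sketch under the assumption that numbers follow the same 3-3-4 digit layout as above; the sample numbers are made up.

```python
import re

text = "Office: 123-456-7890, mobile: 987-654-3210."  # made-up sample numbers
pattern = re.compile(r"(\d{3})-(\d{3})-(\d{4})")

# finditer yields Match objects, giving access to the full match (group 0)
# as well as each captured part (here group 1, the first three digits).
numbers = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]
print(numbers)  # [('123-456-7890', '123'), ('987-654-3210', '987')]
```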

3. Replacing Sensitive Words

import re

text = "This is a bad word: badword"
pattern = r"badword"

replaced_text = re.sub(pattern, "****", text)
print(replaced_text)
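
The same idea extends to several words at once. The sketch below is one possible approach, not a complete filter: the word list BLOCKED_WORDS and the helper mask are hypothetical, and the replacement is a function so that each word is masked with asterisks of its own length.

```python
import re

# Hypothetical word list; a real filter would load its own.
BLOCKED_WORDS = ["badword", "darn"]

# re.escape guards against words that contain metacharacters;
# \b keeps "darn" from matching inside "darning".
blocked_pattern = re.compile(
    r"\b(?:" + "|".join(map(re.escape, BLOCKED_WORDS)) + r")\b",
    re.IGNORECASE,
)

def mask(match: re.Match) -> str:
    """Replace a matched word with asterisks of the same length."""
    return "*" * len(match.group())

text = "This is a BadWord, darn it. Darning is fine."
masked = blocked_pattern.sub(mask, text)
print(masked)  # This is a *******, **** it. Darning is fine.
```

Passing a function as the replacement argument of sub() is a standard feature of the re module: the function receives each Match object and returns the replacement text.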

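Summary

Mastering Python regular expressions helps developers process text data more efficiently. With the syntax, the advanced techniques, and the practical examples covered above, a wide range of text-processing tasks becomes much easier to handle.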