Lab 07: Python Basics

1. Getting Started with Python

Python is a high-level, interpreted programming language. In this lab, you'll learn to write Python scripts that process text files using command-line arguments.

  • First line must be #!/usr/bin/env python3 (shebang)
  • Must use chmod +x to make scripts executable
  • No external programs allowed (shell, perl, etc.)
  • Indentation defines code blocks (no braces!)

2. Command-Line Arguments (sys.argv)

The sys module provides access to command-line arguments via sys.argv, a list of strings.

    $ ./orca.py whales0.txt whales1.txt
    # sys.argv     = ['./orca.py', 'whales0.txt', 'whales1.txt']
    # sys.argv[0]  = script name
    # sys.argv[1:] = file arguments

Key Pattern: Loop over file arguments

    import sys

    total = 0
    for filename in sys.argv[1:]:   # Skip sys.argv[0] (script name)
        # process each file
        pass
    print(total)

⚠️ Common Mistake

sys.argv[0] is the script name itself, NOT the first argument. Use sys.argv[1:] to get file arguments.
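The argument-handling pattern can be wrapped in small functions so it is easy to try with an explicit list instead of a real command line. This is only a sketch: the names file_arguments and main, and the usage message, are illustrative assumptions rather than part of the lab.

```python
#!/usr/bin/env python3
import sys


def file_arguments(argv):
    """Return everything after the script name in an argv-style list."""
    return argv[1:]


def main(argv):
    """Print each file argument; return a shell-style exit status."""
    files = file_arguments(argv)
    if not files:
        # argv[0] is the script name, which is handy for usage messages.
        print(f"Usage: {argv[0]} <file>...", file=sys.stderr)
        return 1
    for filename in files:
        print(filename)
    return 0


# Simulate "./orca.py whales0.txt whales1.txt" without a real command line.
status = main(['./orca.py', 'whales0.txt', 'whales1.txt'])
```

In a real script the last line would pass sys.argv instead of a hand-written list.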

3. File I/O (Reading Files)

Use with open() to safely read files. The file is automatically closed when the block ends.

    with open(filename) as f:
        for line in f:            # Iterate line by line
            line = line.strip()   # Remove leading/trailing whitespace, including the newline
            # process line

Why with?

It's a context manager: it automatically closes the file even if an error occurs. No need to call f.close().

Reading Entire File at Once

    # Read all content as one string
    with open(filename) as f:
        content = f.read()

    # Read all lines into a list
    with open(filename) as f:
        lines = f.readlines()
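To see how the three reading styles differ, the following sketch writes a small temporary file and reads it back each way. The helper count_nonblank_lines and the sample text are made up for illustration.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Count lines that contain something other than whitespace."""
    count = 0
    with open(path) as f:      # the file is closed when this block exits
        for line in f:         # one line at a time, so memory use stays small
            if line.strip():
                count += 1
    return count


# Build a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\n\nsecond\n")
    sample_path = tmp.name

with open(sample_path) as f:
    content = f.read()         # whole file as one string
with open(sample_path) as f:
    lines = f.readlines()      # list of lines, newlines kept

nonblank = count_nonblank_lines(sample_path)
os.remove(sample_path)

print(repr(content))   # 'first\n\nsecond\n'
print(lines)           # ['first\n', '\n', 'second\n']
print(nonblank)        # 2
```

Iterating line by line is usually the better default for large files, while f.read() suits whole-file operations such as a single regex search.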

4. String Methods

split(): Split into parts

    line = "18/01/18 9 Pygmy right whale"
    parts = line.split()       # ['18/01/18', '9', 'Pygmy', 'right', 'whale']
    date = parts[0]            # '18/01/18'
    count = parts[1]           # '9'
    species_parts = parts[2:]  # ['Pygmy', 'right', 'whale']

join(): Combine parts

    parts = ['Pygmy', 'right', 'whale']
    species = ' '.join(parts)  # 'Pygmy right whale'
    # ' ' is the separator (single space)

join() is called ON the separator string!

The string before the dot is what goes BETWEEN elements. ' '.join(list) puts spaces between items. ''.join(list) concatenates with no separator.
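Putting split() and join() together, one observation line can be turned into its three fields. The function name parse_observation is an assumption for illustration. The line format (date, count, then a species name of one or more words) is taken from the example above.

```python
def parse_observation(line):
    """Split 'DATE COUNT SPECIES...' into (date, count, species)."""
    parts = line.split()            # splits on any run of whitespace
    date = parts[0]
    count = int(parts[1])           # split() yields strings, so convert
    species = " ".join(parts[2:])   # rebuild the multi-word species name
    return date, count, species


print(parse_observation("18/01/18  9 Pygmy   right whale\n"))
# ('18/01/18', 9, 'Pygmy right whale')
```

Because split() with no argument discards extra whitespace, re-joining with a single space also tidies up uneven spacing in the input.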

Other Useful String Methods

Method          Description                                        Example
strip()         Remove leading/trailing whitespace                 " hello ".strip() → "hello"
lower()         Convert to lowercase                               "Orca".lower() → "orca"
upper()         Convert to uppercase                               "hi".upper() → "HI"
startswith(s)   Check if starts with s                             "hello".startswith("he") → True
endswith(s)     Check if ends with s                               "python".endswith("on") → True
replace(a, b)   Replace a with b                                   "cat".replace("c","b") → "bat"
repr(x)         Built-in function (not a method): escape a         repr("he's") → "\"he's\""
                string for use in Python code
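Several of these methods are often chained to normalise text before using it as a dictionary key. The rule below (lowercase, collapse whitespace, drop one trailing "s") is a hypothetical normalisation chosen for illustration, not the lab's specification.

```python
def normalise_species(name):
    """Lowercase, collapse whitespace, and strip a single trailing 's'."""
    name = " ".join(name.lower().split())   # "  Pygmy  Right " -> "pygmy right"
    if name.endswith("s"):
        name = name[:-1]                    # "orcas" -> "orca"
    return name


print(normalise_species("  Orcas "))             # orca
print(normalise_species("Pygmy  Right WHALES"))  # pygmy right whale
print(normalise_species("orca"))                 # orca
```

Stripping a trailing "s" is deliberately crude: it would also shorten a name that genuinely ends in "s".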

5. Type Conversion

    # String → Integer
    count = int(parts[1])   # "9" → 9

    # Integer → String
    text = str(42)          # 42 → "42"

    # Any → String (for printing)
    result = f"Found {total} orcas"   # f-string formatting
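int() tolerates surrounding whitespace but raises ValueError when the text is not a whole number. A small wrapper can make that explicit. The name to_int and its default parameter are assumptions for this sketch.

```python
def to_int(text, default=None):
    """Return int(text), or `default` if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default


print(to_int("9"))        # 9
print(to_int(" 12\n"))    # 12
print(to_int("4.5"))      # None
print(to_int("many", 0))  # 0
```

Whether to skip bad values, substitute a default, or let the error stop the script depends on how strictly the input format is guaranteed.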

⚠️ Common Error: TypeError

"9" + 1 raises TypeError. You must convert first: int("9") + 1 → 10. Always use int() when reading numbers from files.

6. Dictionaries

Dictionaries store key-value pairs. Perfect for counting / grouping data.

    species_data = {}

    # Check if key exists, initialize if not
    if species not in species_data:
        species_data[species] = {'pods': 0, 'individuals': 0}

    # Update values
    species_data[species]['pods'] += 1
    species_data[species]['individuals'] += count

Iterate & Sort

    # Iterate over sorted keys
    for species in sorted(species_data.keys()):
        data = species_data[species]
        print(f"{species} observations: {data['pods']} pods, {data['individuals']} individuals")

Dictionary Pattern for Counting

if key not in d: d[key] = initial_value. Reading or updating a key that does not exist raises KeyError, so always check for it first, or use d.setdefault(key, default) or collections.defaultdict.
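The two alternatives to the explicit existence check can be compared side by side. The observations list is made-up sample data, and both variants are sketches of the same counting pattern.

```python
from collections import defaultdict

# Made-up (species, individuals) pairs for illustration.
observations = [("orca", 3), ("pygmy right whale", 9), ("orca", 5)]

# Variant 1: setdefault returns the existing entry or inserts the default.
by_setdefault = {}
for species, count in observations:
    entry = by_setdefault.setdefault(species, {"pods": 0, "individuals": 0})
    entry["pods"] += 1
    entry["individuals"] += count

# Variant 2: defaultdict builds a fresh entry the first time a key is used.
by_defaultdict = defaultdict(lambda: {"pods": 0, "individuals": 0})
for species, count in observations:
    by_defaultdict[species]["pods"] += 1
    by_defaultdict[species]["individuals"] += count

for species in sorted(by_setdefault):
    data = by_setdefault[species]
    print(f"{species} observations: {data['pods']} pods, {data['individuals']} individuals")
# orca observations: 2 pods, 8 individuals
# pygmy right whale observations: 1 pods, 9 individuals
```

One thing to keep in mind with defaultdict is that merely reading a missing key creates it, which can add unintended entries.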

7. Regular Expressions (re module)

The re module provides regex support in Python. re.findall() finds all matches of a pattern in a string.

    import re

    # Read entire file
    with open(filename) as f:
        content = f.read()

    # Find all sequences of digits
    numbers = re.findall(r'\d+', content)
    # returns list of strings: ['42', '5', '100', ...]

    # Convert to integers and sum
    total = sum(int(n) for n in numbers)
    print(total)

Key Points for This Lab

  • \d+ matches one or more digits (with str patterns, \d also matches non-ASCII Unicode decimal digits unless re.ASCII is set; use [0-9] to match 0-9 only)
  • . and - are NOT digits, so -42.5 yields two matches: 42 and 5
  • re.findall() returns a list of strings, not integers
  • Use sum(int(n) for n in numbers) to convert and sum

⚠️ Don't forget import re!

Forgetting import re causes NameError: name 're' is not defined.
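The points above can be checked on a short string without touching any file. The function name sum_numbers is an assumption for this sketch.

```python
import re


def sum_numbers(text):
    """Sum every run of digits in text, ignoring signs and decimal points."""
    return sum(int(n) for n in re.findall(r"\d+", text))


print(re.findall(r"\d+", "-42.5 and 100"))  # ['42', '5', '100']
print(sum_numbers("-42.5 and 100"))         # 147
print(sum_numbers("no digits here"))        # 0
```

Since sum() of an empty sequence is 0, input with no digits needs no special case.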

8. Python Indentation

Python uses indentation to define code blocks; 4 spaces per level is the convention. This is NOT optional: inconsistent indentation raises IndentationError (a kind of SyntaxError), and code indented at the wrong level runs at the wrong time.

    import sys

    total = 0                                  # Level 0
    for filename in sys.argv[1:]:              # Level 0
        with open(filename) as f:              # Level 1
            for line in f:                     # Level 2
                parts = line.split()           # Level 3
                count = int(parts[1])          # Level 3 (assumes DATE COUNT SPECIES lines)
                species = ' '.join(parts[2:])  # Level 3
                if species == 'orca':          # Level 3
                    total += count             # Level 4 ← inside if
    print(total)                               # Level 0 ← outside all loops!

Hierarchy Analogy: Company

  • Boss (Level 0): total = 0
  • Manager (for): distributes tasks
  • Supervisor (with): opens files
  • Worker (for line): processes lines
  • Task (if): conditional action

⚠️ Where to put print()?

If print is indented inside the if, it prints every match. If at Level 0 (no indent), it prints once at the end. Match the indentation to when you want the code to run.
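The effect of indentation on when a statement runs can be shown with two functions that differ only in the indentation of one line. To make the result easy to inspect, this sketch appends to a list where a script would call print(). The function names are illustrative.

```python
def running_totals(counts):
    """Record the total inside the loop: one entry per item."""
    seen = []
    total = 0
    for count in counts:
        total += count
        seen.append(total)   # indented under for: runs on every iteration
    return seen


def final_total(counts):
    """Record the total after the loop: a single entry."""
    seen = []
    total = 0
    for count in counts:
        total += count
    seen.append(total)       # dedented: runs once, after the loop ends
    return seen


print(running_totals([3, 5, 2]))  # [3, 8, 10]
print(final_total([3, 5, 2]))     # [10]
```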

9. Lab Exercises Summary

Exercise              Key Concepts                                Difficulty
orca.py               sys.argv, open, for, split, int, if         Basic
whale_summary.py      dict, sorted, lower, endswith, join         Intermediate
summing_numbers.py    re.findall, \d+, sum, generator expression  Basic
python_print.py       repr(), f-strings, escaping                 Challenge
python_print_n.py     Nested code generation, recursion           Challenge
python_print_inf.py   Quine (self-printing program)               Challenge
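The exact specification of the python_print exercises is not given here. Assuming the idea is to emit Python source that prints a given string, repr() inside an f-string handles the quoting and escaping. The function name make_print_program is hypothetical.

```python
import contextlib
import io


def make_print_program(text):
    """Return Python source code that prints `text` when executed."""
    # !r applies repr(), which picks suitable quotes and escapes the rest.
    return f"print({text!r})"


original = 'he said "it\'s fine"\n'
source = make_print_program(original)
print(source)

# Run the generated code and capture what it prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(source)
round_trip = buffer.getvalue()
```

Here round_trip equals original plus the newline that print() appends, which shows that the escaping survived the trip through generated code.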
