linux shell

关注公众号 jb51net

关闭
首页 > 脚本专栏 > linux shell > Shell脚本字符串处理技巧

Shell脚本中字符串处理的实用技巧分享

作者:Jinkxs

在 Linux 系统管理和自动化运维的世界中,Shell 脚本是不可或缺的工具,而字符串处理,则是 Shell 编程中最常用、最核心的功能之一,本文将深入探讨 Shell 脚本中各种实用的字符串处理技巧,需要的朋友可以参考下

前言

在 Linux 系统管理和自动化运维的世界中,Shell 脚本是不可或缺的工具。而字符串处理,则是 Shell 编程中最常用、最核心的功能之一。无论是日志分析、配置文件修改、用户输入校验,还是系统状态监控,都离不开对字符串的灵活操作。

本文将深入探讨 Shell 脚本中各种实用的字符串处理技巧,从基础语法到高级实战,结合 Java 代码对比示例,帮助你全面掌握这一关键技能。无论你是刚入门的新手,还是有一定经验的开发者,都能从中获得实用的知识和灵感。

字符串基础:定义与赋值

在 Shell 中,字符串是最基本的数据类型。你可以使用单引号、双引号或不加引号来定义字符串:

#!/bin/bash
str1='Hello World'          # 单引号:内容原样输出
str2="Hello $USER"          # 双引号:支持变量扩展
str3=Hello\ World           # 不加引号:需转义空格
str4="Today is $(date)"     # 支持命令替换
echo "$str1"
echo "$str2"
echo "$str3"
echo "$str4"

注意:单引号内的所有内容都会被当作字面量,不会进行变量替换或命令执行。

Java 对比示例:

public class StringBasics {
    public static void main(String[] args) {
        String str1 = "Hello World";              // 相当于 Shell 单引号
        String user = System.getProperty("user.name");
        String str2 = "Hello " + user;            // 相当于 Shell 双引号变量拼接
        String str3 = "Hello World";              // Java 中空格无需转义
        String str4 = "Today is " + java.time.LocalDate.now(); // 命令替换对应
        System.out.println(str1);
        System.out.println(str2);
        System.out.println(str3);
        System.out.println(str4);
    }
}

字符串长度计算

获取字符串长度是常见的需求,在 Shell 中有多种方式:

#!/bin/bash

text="Hello, Shell Programming!"

# 方法1:使用 ${#variable}
len1=${#text}
echo "方法1 - 长度: $len1"

# 方法2:使用 expr length
len2=$(expr length "$text")
echo "方法2 - 长度: $len2"

# 方法3:使用 wc -m(注意会包含换行符)
len3=$(echo -n "$text" | wc -m)
echo "方法3 - 长度: $len3"

# 方法4:使用 awk
len4=$(echo "$text" | awk '{print length}')
echo "方法4 - 长度: $len4"

实战应用:密码强度检查

#!/bin/bash

check_password_strength() {
    local password="$1"
    local length=${#password}
    
    if [ $length -lt 8 ]; then
        echo "❌ 密码太短,至少需要8个字符"
        return 1
    elif [ $length -lt 12 ]; then
        echo "⚠️ 密码长度合格,但建议使用12位以上"
        return 0
    else
        echo "✅ 密码长度优秀!"
        return 0
    fi
}

# 测试
check_password_strength "abc123"
check_password_strength "mySecurePassword123"

Java 对比示例:

public class StringLength {
    public static void checkPasswordStrength(String password) {
        int length = password.length();
        if (length < 8) {
            System.out.println("❌ 密码太短,至少需要8个字符");
        } else if (length < 12) {
            System.out.println("⚠️ 密码长度合格,但建议使用12位以上");
        } else {
            System.out.println("✅ 密码长度优秀!");
        }
    }
    public static void main(String[] args) {
        checkPasswordStrength("abc123");
        checkPasswordStrength("mySecurePassword123");
        // 其他长度计算方法
        String text = "Hello, Java Programming!";
        System.out.println("长度: " + text.length());
        System.out.println("字节数: " + text.getBytes().length);
    }
}

字符串截取与切片

Shell 提供了强大的字符串截取功能,让你可以轻松提取子字符串。

基础截取语法

#!/bin/bash

str="Hello World Shell Scripting"

# 从位置0开始,取5个字符
substr1=${str:0:5}
echo "前5个字符: $substr1"  # 输出: Hello

# 从位置6开始到结尾
substr2=${str:6}
echo "从第7个字符开始: $substr2"  # 输出: World Shell Scripting

# 从倒数第8个字符开始,取6个字符
substr3=${str: -8:6}
echo "倒数第8个开始取6个: $substr3"  # 输出: Script

# 仅指定起始位置(到结尾)
substr4=${str:12}
echo "从第13个字符开始: $substr4"  # 输出: Shell Scripting

实战应用:文件路径处理

#!/bin/bash

process_filepath() {
    local filepath="$1"
    
    echo "原始路径: $filepath"
    
    # 获取文件名(最后一个/之后的内容)
    filename=${filepath##*/}
    echo "文件名: $filename"
    
    # 获取目录路径(最后一个/之前的内容)
    dirpath=${filepath%/*}
    echo "目录路径: $dirpath"
    
    # 获取文件扩展名
    extension=${filename##*.}
    echo "扩展名: $extension"
    
    # 获取不含扩展名的文件名
    basename=${filename%.*}
    echo "基础文件名: $basename"
}

# 测试
process_filepath "/home/user/documents/report.pdf"
process_filepath "/var/log/system.log"

使用 cut 命令截取

#!/bin/bash

data="John,Doe,30,Engineer,New York"

# 按逗号分隔,取第1字段
first_name=$(echo "$data" | cut -d',' -f1)
echo "名字: $first_name"

# 取第1-2字段
name=$(echo "$data" | cut -d',' -f1-2)
echo "姓名: $name"

# 取第3和第5字段
age_city=$(echo "$data" | cut -d',' -f3,5)
echo "年龄和城市: $age_city"

# 按字符位置截取
char_slice=$(echo "$data" | cut -c1-10)
echo "前10个字符: $char_slice"

Java 对比示例:

public class StringSubstring {
    public static void processFilepath(String filepath) {
        System.out.println("原始路径: " + filepath);
        // 获取文件名
        int lastSlash = filepath.lastIndexOf('/');
        String filename = lastSlash == -1 ? filepath : filepath.substring(lastSlash + 1);
        System.out.println("文件名: " + filename);
        // 获取目录路径
        String dirpath = lastSlash == -1 ? "" : filepath.substring(0, lastSlash);
        System.out.println("目录路径: " + dirpath);
        // 获取文件扩展名
        int lastDot = filename.lastIndexOf('.');
        String extension = lastDot == -1 ? "" : filename.substring(lastDot + 1);
        System.out.println("扩展名: " + extension);
        // 获取不含扩展名的文件名
        String basename = lastDot == -1 ? filename : filename.substring(0, lastDot);
        System.out.println("基础文件名: " + basename);
    }
    public static void main(String[] args) {
        String str = "Hello World Shell Scripting";
        // 基础截取
        System.out.println("前5个字符: " + str.substring(0, 5));
        System.out.println("从第7个字符开始: " + str.substring(6));
        // 处理CSV数据
        String data = "John,Doe,30,Engineer,New York";
        String[] fields = data.split(",");
        System.out.println("名字: " + fields[0]);
        System.out.println("姓名: " + fields[0] + "," + fields[1]);
        System.out.println("年龄和城市: " + fields[2] + "," + fields[4]);
        System.out.println("前10个字符: " + data.substring(0, Math.min(10, data.length())));
        // 处理文件路径
        processFilepath("/home/user/documents/report.pdf");
    }
}

字符串查找与定位

在 Shell 中查找字符串的位置和判断是否存在是常见需求。

使用 expr index 查找字符位置

#!/bin/bash

text="Hello World"

# 查找第一个 'o' 的位置(从1开始计数)
pos=$(expr index "$text" "o")
echo "第一个 'o' 的位置: $pos"

# 查找字符集中的任意字符
pos2=$(expr index "$text" "aeiou")
echo "第一个元音字母的位置: $pos2"

# 判断字符是否存在
if [ $pos -gt 0 ]; then
    echo "✅ 找到了 'o'"
else
    echo "❌ 未找到 'o'"
fi

使用 grep 判断字符串存在

#!/bin/bash

check_string_contains() {
    local haystack="$1"
    local needle="$2"
    
    if echo "$haystack" | grep -q "$needle"; then
        echo "✅ '$needle' 存在于 '$haystack'"
        return 0
    else
        echo "❌ '$needle' 不存在于 '$haystack'"
        return 1
    fi
}

# 测试
check_string_contains "Hello World" "World"
check_string_contains "Hello World" "Python"
check_string_contains "The quick brown fox" "quick"

查找所有匹配位置

#!/bin/bash

find_all_positions() {
    local text="$1"
    local search="$2"
    local pos=0
    local found=0
    
    echo "在 '$text' 中查找 '$search':"
    
    while true; do
        # 从当前位置开始查找
        temp=${text:$pos}
        index=$(expr index "$temp" "$search")
        
        if [ $index -eq 0 ]; then
            break
        fi
        
        actual_pos=$((pos + index - 1))
        echo "  位置 $actual_pos"
        pos=$((actual_pos + 1))
        found=1
    done
    
    if [ $found -eq 0 ]; then
        echo "  未找到匹配项"
    fi
}

# 测试
find_all_positions "banana" "a"
find_all_positions "programming" "m"

Java 对比示例:

import java.util.ArrayList;
import java.util.List;
public class StringSearch {
    public static boolean containsString(String haystack, String needle) {
        boolean found = haystack.contains(needle);
        if (found) {
            System.out.println("✅ '" + needle + "' 存在于 '" + haystack + "'");
        } else {
            System.out.println("❌ '" + needle + "' 不存在于 '" + haystack + "'");
        }
        return found;
    }
    public static List<Integer> findAllPositions(String text, String search) {
        List<Integer> positions = new ArrayList<>();
        int index = 0;
        System.out.println("在 '" + text + "' 中查找 '" + search + "':");
        while ((index = text.indexOf(search, index)) != -1) {
            positions.add(index);
            System.out.println("  位置 " + index);
            index += 1; // 移动到下一个位置
        }
        if (positions.isEmpty()) {
            System.out.println("  未找到匹配项");
        }
        return positions;
    }
    public static void main(String[] args) {
        String text = "Hello World";
        // 查找字符位置
        int pos = text.indexOf('o');
        System.out.println("第一个 'o' 的位置: " + (pos + 1)); // Java从0开始,Shell从1开始
        // 查找元音字母
        String vowels = "aeiou";
        int firstVowelPos = -1;
        for (int i = 0; i < text.length(); i++) {
            if (vowels.indexOf(text.charAt(i)) != -1) {
                firstVowelPos = i;
                break;
            }
        }
        System.out.println("第一个元音字母的位置: " + (firstVowelPos + 1));
        // 测试包含判断
        containsString("Hello World", "World");
        containsString("Hello World", "Python");
        // 查找所有位置
        findAllPositions("banana", "a");
        findAllPositions("programming", "m");
    }
}

字符串替换与修改

Shell 提供了多种字符串替换的方法,从简单的全局替换到复杂的正则表达式替换。

基础替换语法

#!/bin/bash

text="Hello World, Hello Shell, Hello Linux"

# 替换第一个匹配项
result1=${text/Hello/Hi}
echo "替换第一个: $result1"

# 替换所有匹配项
result2=${text//Hello/Hi}
echo "替换所有: $result2"

# 替换开头匹配项
result3=${text/#Hello/Hi}
echo "替换开头: $result3"

# 替换结尾匹配项
result4=${text/%Linux/Unix}
echo "替换结尾: $result4"

# 删除匹配项(替换为空)
result5=${text//Hello/}
echo "删除所有Hello: $result5"

使用 sed 进行高级替换

#!/bin/bash

text="User: John, Age: 30, City: New York"

# 基础替换
result1=$(echo "$text" | sed 's/John/Jane/g')
echo "替换名字: $result1"

# 使用正则表达式
result2=$(echo "$text" | sed 's/Age: [0-9][0-9]/Age: 35/g')
echo "更新年龄: $result2"

# 多次替换
result3=$(echo "$text" | sed -e 's/John/Jane/g' -e 's/30/35/g' -e 's/New York/Boston/g')
echo "多重替换: $result3"

# 条件替换
result4=$(echo "$text" | sed '/John/s/Age: 30/Age: 35/')
echo "条件替换: $result4"

实战应用:批量重命名文件

#!/bin/bash

rename_files() {
    local old_pattern="$1"
    local new_pattern="$2"
    local directory="${3:-.}"
    
    echo "在目录 '$directory' 中将 '$old_pattern' 替换为 '$new_pattern'"
    
    for file in "$directory"/*; do
        if [ -f "$file" ]; then
            filename=$(basename "$file")
            if [[ "$filename" == *"$old_pattern"* ]]; then
                new_filename=${filename//$old_pattern/$new_pattern}
                mv "$file" "$directory/$new_filename"
                echo "✓ 重命名: $filename → $new_filename"
            fi
        fi
    done
}

# 使用示例(注释掉实际执行,仅显示逻辑)
# rename_files "draft_" "final_" "./documents"
# rename_files ".txt" ".md" "./notes"

Java 对比示例:

public class StringReplace {
    public static void main(String[] args) {
        String text = "Hello World, Hello Shell, Hello Linux";
        // 替换第一个匹配项
        String result1 = text.replaceFirst("Hello", "Hi");
        System.out.println("替换第一个: " + result1);
        // 替换所有匹配项
        String result2 = text.replaceAll("Hello", "Hi");
        System.out.println("替换所有: " + result2);
        // 替换开头匹配项
        String result3 = text.startsWith("Hello") ? "Hi" + text.substring(5) : text;
        System.out.println("替换开头: " + result3);
        // 替换结尾匹配项
        String result4 = text.endsWith("Linux") ? 
            text.substring(0, text.length() - 5) + "Unix" : text;
        System.out.println("替换结尾: " + result4);
        // 删除匹配项
        String result5 = text.replaceAll("Hello", "");
        System.out.println("删除所有Hello: " + result5);
        // 使用正则表达式
        String userData = "User: John, Age: 30, City: New York";
        String result6 = userData.replaceAll("John", "Jane");
        System.out.println("替换名字: " + result6);
        String result7 = userData.replaceAll("Age: \\d+", "Age: 35");
        System.out.println("更新年龄: " + result7);
        // 多重替换
        String result8 = userData
            .replace("John", "Jane")
            .replace("30", "35")
            .replace("New York", "Boston");
        System.out.println("多重替换: " + result8);
    }
}

正则表达式匹配

Shell 中的正则表达式功能强大,可以帮助你进行复杂的字符串模式匹配。

基础正则匹配

#!/bin/bash

validate_email() {
    local email="$1"
    local email_regex='^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    
    if [[ "$email" =~ $email_regex ]]; then
        echo "✅ 有效的邮箱地址: $email"
        return 0
    else
        echo "❌ 无效的邮箱地址: $email"
        return 1
    fi
}

validate_phone() {
    local phone="$1"
    local phone_regex='^(\+?1[-.\s]?)?\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$'
    
    if [[ "$phone" =~ $phone_regex ]]; then
        echo "✅ 有效的电话号码: $phone"
        # 提取区号
        echo "区号: ${BASH_REMATCH[2]}"
        return 0
    else
        echo "❌ 无效的电话号码: $phone"
        return 1
    fi
}

# 测试
validate_email "user@example.com"
validate_email "invalid.email"
validate_phone "123-456-7890"
validate_phone "(555) 123-4567"
validate_phone "+1 555 123 4567"

使用 grep 进行正则匹配

#!/bin/bash

# 创建测试数据
cat > test_data.txt << EOF
john.doe@example.com
invalid.email
jane.smith@company.org
user123@gmail.com
not_an_email
admin@site.co.uk
EOF

echo "=== 邮箱验证结果 ==="
while IFS= read -r line; do
    if echo "$line" | grep -qE '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'; then
        echo "✅ $line"
    else
        echo "❌ $line"
    fi
done < test_data.txt

# 提取特定格式的数据
echo -e "\n=== 提取.com邮箱 ==="
grep -E '@[^.]*\.com$' test_data.txt

# 统计匹配数量
count=$(grep -cE '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' test_data.txt)
echo -e "\n有效邮箱总数: $count"

# 清理测试文件
rm test_data.txt

复杂模式匹配示例

#!/bin/bash

analyze_log_entry() {
    local log_entry="$1"
    
    # 匹配日志格式: [日期 时间] [级别] 消息
    local log_pattern='^\[([0-9]{4}-[0-9]{2}-[0-9]{2}) ([0-9]{2}:[0-9]{2}:[0-9]{2})\] \[([A-Z]+)\] (.*)$'
    
    if [[ "$log_entry" =~ $log_pattern ]]; then
        echo "✅ 有效的日志条目"
        echo "日期: ${BASH_REMATCH[1]}"
        echo "时间: ${BASH_REMATCH[2]}"
        echo "级别: ${BASH_REMATCH[3]}"
        echo "消息: ${BASH_REMATCH[4]}"
        return 0
    else
        echo "❌ 无效的日志格式: $log_entry"
        return 1
    fi
}

# 测试
analyze_log_entry "[2024-01-15 14:30:25] [ERROR] Database connection failed"
analyze_log_entry "[2024-01-15 14:31:10] [INFO] User login successful"
analyze_log_entry "Invalid log format"

Java 对比示例:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexMatching {
    public static boolean validateEmail(String email) {
        String emailRegex = "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$";
        Pattern pattern = Pattern.compile(emailRegex);
        Matcher matcher = pattern.matcher(email);
        if (matcher.matches()) {
            System.out.println("✅ 有效的邮箱地址: " + email);
            return true;
        } else {
            System.out.println("❌ 无效的邮箱地址: " + email);
            return false;
        }
    }
    public static boolean validatePhone(String phone) {
        String phoneRegex = "^(\\+?1[-.\\s]?)?\\(?([0-9]{3})\\)?[-.\\s]?([0-9]{3})[-.\\s]?([0-9]{4})$";
        Pattern pattern = Pattern.compile(phoneRegex);
        Matcher matcher = pattern.matcher(phone);
        if (matcher.matches()) {
            System.out.println("✅ 有效的电话号码: " + phone);
            // 提取区号
            System.out.println("区号: " + matcher.group(2));
            return true;
        } else {
            System.out.println("❌ 无效的电话号码: " + phone);
            return false;
        }
    }
    public static void analyzeLogEntry(String logEntry) {
        // 匹配日志格式: [日期 时间] [级别] 消息
        String logPattern = "^\\[([0-9]{4}-[0-9]{2}-[0-9]{2}) ([0-9]{2}:[0-9]{2}:[0-9]{2})\\] \\[([A-Z]+)\\] (.*)$";
        Pattern pattern = Pattern.compile(logPattern);
        Matcher matcher = pattern.matcher(logEntry);
        if (matcher.matches()) {
            System.out.println("✅ 有效的日志条目");
            System.out.println("日期: " + matcher.group(1));
            System.out.println("时间: " + matcher.group(2));
            System.out.println("级别: " + matcher.group(3));
            System.out.println("消息: " + matcher.group(4));
        } else {
            System.out.println("❌ 无效的日志格式: " + logEntry);
        }
    }
    public static void main(String[] args) {
        // 测试邮箱验证
        validateEmail("user@example.com");
        validateEmail("invalid.email");
        // 测试电话验证
        validatePhone("123-456-7890");
        validatePhone("(555) 123-4567");
        validatePhone("+1 555 123 4567");
        // 测试日志分析
        analyzeLogEntry("[2024-01-15 14:30:25] [ERROR] Database connection failed");
        analyzeLogEntry("[2024-01-15 14:31:10] [INFO] User login successful");
        analyzeLogEntry("Invalid log format");
        // 使用正则表达式提取数据
        String testData = "john.doe@example.com\ninvalid.email\njane.smith@company.org";
        String[] lines = testData.split("\n");
        System.out.println("\n=== 邮箱验证结果 ===");
        int validCount = 0;
        for (String line : lines) {
            if (validateEmail(line)) {
                validCount++;
            }
        }
        System.out.println("\n有效邮箱总数: " + validCount);
    }
}

字符串分割与数组处理

将字符串分割成数组是处理结构化数据的重要技能。

使用 IFS 分割字符串

#!/bin/bash

split_with_ifs() {
    local string="$1"
    local delimiter="$2"
    
    echo "原始字符串: $string"
    echo "分隔符: '$delimiter'"
    
    # 保存原来的IFS
    OLD_IFS=$IFS
    IFS=$delimiter
    
    # 分割字符串
    read -ra parts <<< "$string"
    
    # 恢复IFS
    IFS=$OLD_IFS
    
    echo "分割结果:"
    for i in "${!parts[@]}"; do
        echo "  [$i]: '${parts[i]}'"
    done
    
    echo "总共有 ${#parts[@]} 个部分"
}

# 测试不同分隔符
split_with_ifs "apple,banana,cherry,date" ","
echo ""
split_with_ifs "red|green|blue|yellow" "|"
echo ""
split_with_ifs "one two three four" " "

使用 cut 和 awk 分割

#!/bin/bash

process_csv_data() {
    local csv_line="$1"
    
    echo "CSV数据: $csv_line"
    
    # 使用cut分割
    name=$(echo "$csv_line" | cut -d',' -f1)
    age=$(echo "$csv_line" | cut -d',' -f2)
    city=$(echo "$csv_line" | cut -d',' -f3)
    
    echo "使用cut分割:"
    echo "  姓名: $name"
    echo "  年龄: $age"
    echo "  城市: $city"
    
    # 使用awk分割
    echo "使用awk分割:"
    echo "$csv_line" | awk -F',' '{
        printf "  姓名: %s\n", $1
        printf "  年龄: %s\n", $2
        printf "  城市: %s\n", $3
        printf "  总字段数: %d\n", NF
    }'
}

# 测试
process_csv_data "张三,25,北京"
echo ""
process_csv_data "李四,30,上海"

复杂分割场景

#!/bin/bash

parse_config_line() {
    local config_line="$1"
    
    echo "配置行: $config_line"
    
    # 处理 key=value 格式
    if [[ "$config_line" == *"="* ]]; then
        # 使用IFS分割
        OLD_IFS=$IFS
        IFS='='
        read -ra parts <<< "$config_line"
        IFS=$OLD_IFS
        
        if [ ${#parts[@]} -eq 2 ]; then
            key=$(echo "${parts[0]}" | xargs)  # 去除前后空格
            value=$(echo "${parts[1]}" | xargs)  # 去除前后空格
            
            echo "键: '$key'"
            echo "值: '$value'"
            
            # 进一步处理值(如果包含逗号分隔的列表)
            if [[ "$value" == *","* ]]; then
                OLD_IFS=$IFS
                IFS=','
                read -ra list_items <<< "$value"
                IFS=$OLD_IFS
                
                echo "列表项:"
                for item in "${list_items[@]}"; do
                    trimmed_item=$(echo "$item" | xargs)
                    echo "  - $trimmed_item"
                done
            fi
        fi
    fi
}

# 测试
parse_config_line "database.host=localhost"
echo ""
parse_config_line "allowed.origins=http://example.com, https://api.example.com, http://localhost:3000"
echo ""
parse_config_line "max.connections=100"

数组操作技巧

#!/bin/bash

array_operations() {
    local string="$1"
    local delimiter="$2"
    
    echo "=== 数组操作演示 ==="
    echo "源字符串: $string"
    
    # 创建数组
    OLD_IFS=$IFS
    IFS=$delimiter
    read -ra arr <<< "$string"
    IFS=$OLD_IFS
    
    echo "数组内容: ${arr[*]}"
    echo "数组长度: ${#arr[@]}"
    
    # 遍历数组
    echo "遍历数组:"
    for i in "${!arr[@]}"; do
        echo "  索引 $i: ${arr[i]}"
    done
    
    # 添加元素
    arr+=("new_element")
    echo "添加元素后: ${arr[*]}"
    
    # 删除元素
    unset 'arr[1]'  # 删除索引1的元素
    echo "删除索引1后: ${arr[*]}"
    
    # 连接数组元素
    joined=$(IFS=$delimiter; echo "${arr[*]}")
    echo "重新连接: $joined"
    
    # 排序数组
    sorted=($(for i in "${arr[@]}"; do echo "$i"; done | sort))
    echo "排序后: ${sorted[*]}"
}

# 测试
array_operations "zebra,apple,banana,cherry" ","

Java 对比示例:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
public class StringSplitting {
    public static void splitWithString(String string, String delimiter) {
        System.out.println("原始字符串: " + string);
        System.out.println("分隔符: '" + delimiter + "'");
        // 分割字符串
        String[] parts = string.split(delimiter);
        System.out.println("分割结果:");
        for (int i = 0; i < parts.length; i++) {
            System.out.println("  [" + i + "]: '" + parts[i] + "'");
        }
        System.out.println("总共有 " + parts.length + " 个部分");
    }
    public static void processCsvData(String csvLine) {
        System.out.println("CSV数据: " + csvLine);
        // 分割CSV
        String[] fields = csvLine.split(",");
        System.out.println("分割结果:");
        System.out.println("  姓名: " + fields[0]);
        System.out.println("  年龄: " + fields[1]);
        System.out.println("  城市: " + fields[2]);
        System.out.println("  总字段数: " + fields.length);
    }
    public static void parseConfigLine(String configLine) {
        System.out.println("配置行: " + configLine);
        // 处理 key=value 格式
        if (configLine.contains("=")) {
            String[] parts = configLine.split("=", 2);  // 最多分割成2部分
            String key = parts[0].trim();
            String value = parts[1].trim();
            System.out.println("键: '" + key + "'");
            System.out.println("值: '" + value + "'");
            // 进一步处理值(如果包含逗号分隔的列表)
            if (value.contains(",")) {
                String[] listItems = value.split(",");
                System.out.println("列表项:");
                for (String item : listItems) {
                    System.out.println("  - " + item.trim());
                }
            }
        }
    }
    public static void arrayOperations(String string, String delimiter) {
        System.out.println("=== 数组操作演示 ===");
        System.out.println("源字符串: " + string);
        // 创建数组
        String[] arr = string.split(delimiter);
        System.out.println("数组内容: " + Arrays.toString(arr));
        System.out.println("数组长度: " + arr.length);
        // 遍历数组
        System.out.println("遍历数组:");
        for (int i = 0; i < arr.length; i++) {
            System.out.println("  索引 " + i + ": " + arr[i]);
        }
        // 添加元素(需要创建新数组)
        String[] newArr = Arrays.copyOf(arr, arr.length + 1);
        newArr[arr.length] = "new_element";
        System.out.println("添加元素后: " + Arrays.toString(newArr));
        // 删除元素(需要创建新数组)
        List<String> list = Arrays.asList(newArr).stream()
            .filter((item, index) -> index != 1)
            .collect(Collectors.toList());
        System.out.println("删除索引1后: " + list);
        // 连接数组元素
        String joined = String.join(delimiter, newArr);
        System.out.println("重新连接: " + joined);
        // 排序数组
        String[] sorted = newArr.clone();
        Arrays.sort(sorted);
        System.out.println("排序后: " + Arrays.toString(sorted));
    }
    public static void main(String[] args) {
        // 测试不同分隔符
        splitWithString("apple,banana,cherry,date", ",");
        System.out.println();
        splitWithString("red|green|blue|yellow", "\\|");
        System.out.println();
        splitWithString("one two three four", " ");
        // 测试CSV处理
        System.out.println();
        processCsvData("张三,25,北京");
        // 测试配置解析
        System.out.println();
        parseConfigLine("database.host=localhost");
        System.out.println();
        parseConfigLine("allowed.origins=http://example.com, https://api.example.com, http://localhost:3000");
        // 测试数组操作
        System.out.println();
        arrayOperations("zebra,apple,banana,cherry", ",");
    }
}

字符串大小写转换

在文本处理中,经常需要转换字符串的大小写格式。

使用 tr 命令转换

#!/bin/bash

convert_case() {
    local text="$1"
    
    echo "原始文本: $text"
    
    # 转换为大写
    upper=$(echo "$text" | tr '[:lower:]' '[:upper:]')
    echo "大写: $upper"
    
    # 转换为小写
    lower=$(echo "$text" | tr '[:upper:]' '[:lower:]')
    echo "小写: $lower"
    
    # 首字母大写
    first_upper=$(echo "$text" | sed 's/.*/\L&/; s/[a-z]/\U&/')
    echo "首字母大写: $first_upper"
    
    # 每个单词首字母大写
    title_case=$(echo "$text" | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) tolower(substr($i,2));}1')
    echo "标题格式: $title_case"
}

# 测试
convert_case "hello world shell scripting"
echo ""
convert_case "PYTHON JAVA JAVASCRIPT"

使用 Bash 内建功能(4.0+版本)

#!/bin/bash

# 检查Bash版本
echo "Bash版本: $BASH_VERSION"

if [[ ${BASH_VERSION%%.*} -ge 4 ]]; then
    echo "使用Bash 4.0+内建大小写转换功能"
    
    text="hello WORLD Shell Scripting"
    echo "原始文本: $text"
    
    # 转换为大写
    echo "大写: ${text^^}"
    
    # 转换为小写
    echo "小写: ${text,,}"
    
    # 首字母大写
    echo "首字母大写: ${text^}"
    
    # 首字母小写
    echo "首字母小写: ${text,}"
    
    # 对特定字符进行大小写转换
    echo "转换特定字符(l→L): ${text^^l}"
    echo "转换特定字符(S→s): ${text,,S}"
else
    echo "Bash版本低于4.0,使用tr命令替代"
    text="hello WORLD Shell Scripting"
    echo "大写: $(echo "$text" | tr '[:lower:]' '[:upper:]')"
    echo "小写: $(echo "$text" | tr '[:upper:]' '[:lower:]')"
fi

实战应用:标准化文件名

#!/bin/bash

standardize_filename() {
    local filename="$1"
    local format="$2"  # snake, kebab, camel, pascal
    
    echo "原始文件名: $filename"
    echo "目标格式: $format"
    
    # 移除扩展名
    name="${filename%.*}"
    extension="${filename##*.}"
    has_extension=$([ "$filename" != "$name" ] && echo "true" || echo "false")
    
    case $format in
        "snake")
            # 转换为snake_case
            result=$(echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-zA-Z0-9]/_/g' | sed 's/__*/_/g' | sed 's/^_//;s/_$//')
            ;;
        "kebab")
            # 转换为kebab-case
            result=$(echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-zA-Z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//;s/-$//')
            ;;
        "camel")
            # 转换为camelCase
            temp=$(echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-zA-Z0-9]/ /g')
            result=$(echo "$temp" | awk '{
                for(i=1; i<=NF; i++) {
                    if(i==1) {
                        printf "%s", $i
                    } else {
                        printf "%s%s", toupper(substr($i,1,1)), substr($i,2)
                    }
                }
            }')
            ;;
        "pascal")
            # 转换为PascalCase
            temp=$(echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-zA-Z0-9]/ /g')
            result=$(echo "$temp" | awk '{
                for(i=1; i<=NF; i++) {
                    printf "%s%s", toupper(substr($i,1,1)), substr($i,2)
                }
            }')
            ;;
        *)
            echo "未知格式: $format"
            return 1
            ;;
    esac
    
    # 重新添加扩展名
    if [ "$has_extension" = "true" ]; then
        result="$result.$extension"
    fi
    
    echo "标准化后: $result"
    return 0
}

# 测试不同格式
standardize_filename "My Document FINAL v2.pdf" "snake"
echo ""
standardize_filename "user_profile_settings.json" "kebab"
echo ""
standardize_filename "HELLO-WORLD-TEST.TXT" "camel"
echo ""
standardize_filename "data backup 2024.csv" "pascal"

Java 对比示例:

import java.util.Arrays;
import java.util.regex.Pattern;
public class CaseConversion {
    public static void convertCase(String text) {
        System.out.println("原始文本: " + text);
        // 转换为大写
        String upper = text.toUpperCase();
        System.out.println("大写: " + upper);
        // 转换为小写
        String lower = text.toLowerCase();
        System.out.println("小写: " + lower);
        // 首字母大写
        String firstUpper = text.isEmpty() ? text : 
            text.substring(0, 1).toUpperCase() + text.substring(1).toLowerCase();
        System.out.println("首字母大写: " + firstUpper);
        // 每个单词首字母大写
        String titleCase = Arrays.stream(text.split("\\s+"))
            .map(word -> word.isEmpty() ? word : 
                word.substring(0, 1).toUpperCase() + word.substring(1).toLowerCase())
            .collect(Collectors.joining(" "));
        System.out.println("标题格式: " + titleCase);
    }
    public static String standardizeFilename(String filename, String format) {
        System.out.println("原始文件名: " + filename);
        System.out.println("目标格式: " + format);
        // 分离文件名和扩展名
        int lastDot = filename.lastIndexOf('.');
        String name = lastDot == -1 ? filename : filename.substring(0, lastDot);
        String extension = lastDot == -1 ? "" : filename.substring(lastDot);
        boolean hasExtension = lastDot != -1;
        String result;
        switch (format.toLowerCase()) {
            case "snake":
                // 转换为snake_case
                result = name.toLowerCase()
                    .replaceAll("[^a-zA-Z0-9]", "_")
                    .replaceAll("_+", "_")
                    .replaceAll("^_|_$", "");
                break;
            case "kebab":
                // 转换为kebab-case
                result = name.toLowerCase()
                    .replaceAll("[^a-zA-Z0-9]", "-")
                    .replaceAll("-+", "-")
                    .replaceAll("^-|-$", "");
                break;
            case "camel":
                // 转换为camelCase
                String[] words = name.toLowerCase().split("[^a-zA-Z0-9]+");
                StringBuilder camelBuilder = new StringBuilder();
                for (int i = 0; i < words.length; i++) {
                    if (i == 0) {
                        camelBuilder.append(words[i]);
                    } else if (!words[i].isEmpty()) {
                        camelBuilder.append(
                            words[i].substring(0, 1).toUpperCase() + 
                            words[i].substring(1)
                        );
                    }
                }
                result = camelBuilder.toString();
                break;
            case "pascal":
                // 转换为PascalCase
                String[] pascalWords = name.toLowerCase().split("[^a-zA-Z0-9]+");
                StringBuilder pascalBuilder = new StringBuilder();
                for (String word : pascalWords) {
                    if (!word.isEmpty()) {
                        pascalBuilder.append(
                            word.substring(0, 1).toUpperCase() + 
                            word.substring(1)
                        );
                    }
                }
                result = pascalBuilder.toString();
                break;
            default:
                System.out.println("未知格式: " + format);
                return filename;
        }
        // 重新添加扩展名
        if (hasExtension) {
            result = result + extension;
        }
        System.out.println("标准化后: " + result);
        return result;
    }
    public static void main(String[] args) {
        // 测试大小写转换
        convertCase("hello world shell scripting");
        System.out.println();
        convertCase("PYTHON JAVA JAVASCRIPT");
        // 测试文件名标准化
        System.out.println();
        standardizeFilename("My Document FINAL v2.pdf", "snake");
        System.out.println();
        standardizeFilename("user_profile_settings.json", "kebab");
        System.out.println();
        standardizeFilename("HELLO-WORLD-TEST.TXT", "camel");
        System.out.println();
        standardizeFilename("data backup 2024.csv", "pascal");
    }
}

字符串比较与条件判断

字符串比较是 Shell 脚本中常用的条件判断操作。

基础字符串比较

#!/bin/bash

basic_string_comparison() {
    local str1="$1"
    local str2="$2"
    
    echo "比较: '$str1' 和 '$str2'"
    
    # 等于比较
    if [ "$str1" = "$str2" ]; then
        echo "✅ 字符串相等 (使用 =)"
    else
        echo "❌ 字符串不相等 (使用 =)"
    fi
    
    # 不等于比较
    if [ "$str1" != "$str2" ]; then
        echo "✅ 字符串不相等 (使用 !=)"
    else
        echo "❌ 字符串相等 (使用 !=)"
    fi
    
    # 使用双括号(更现代的语法)
    if [[ "$str1" == "$str2" ]]; then
        echo "✅ 字符串相等 (使用 ==)"
    else
        echo "❌ 字符串不相等 (使用 ==)"
    fi
    
    # 长度比较
    if [ ${#str1} -eq ${#str2} ]; then
        echo "✅ 长度相等: ${#str1} = ${#str2}"
    else
        echo "❌ 长度不相等: ${#str1} ≠ ${#str2}"
    fi
    
    # 字典序比较
    if [[ "$str1" < "$str2" ]]; then
        echo "✅ '$str1' 在字典序中小于 '$str2'"
    elif [[ "$str1" > "$str2" ]]; then
        echo "✅ '$str1' 在字典序中大于 '$str2'"
    else
        echo "✅ '$str1' 在字典序中等于 '$str2'"
    fi
}

# 测试
basic_string_comparison "hello" "hello"
echo ""
basic_string_comparison "hello" "world"
echo ""
basic_string_comparison "apple" "banana"

模式匹配比较

#!/bin/bash

pattern_matching() {
    local string="$1"
    local pattern="$2"
    
    echo "检查: '$string' 是否匹配模式 '$pattern'"
    
    # 使用通配符匹配
    if [[ "$string" == $pattern ]]; then
        echo "✅ 匹配成功"
    else
        echo "❌ 匹配失败"
    fi
    
    # 常见模式示例
    echo -e "\n=== 常见模式匹配示例 ==="
    
    # 以特定字符开头
    if [[ "$string" == h* ]]; then
        echo "✅ 以 'h' 开头"
    fi
    
    # 以特定字符结尾
    if [[ "$string" == *d ]]; then
        echo "✅ 以 'd' 结尾"
    fi
    
    # 包含特定子串
    if [[ "$string" == *ll* ]]; then
        echo "✅ 包含 'll'"
    fi
    
    # 特定长度
    if [[ "$string" == ???? ]]; then
        echo "✅ 长度为4"
    fi
    
    # 字符集合
    if [[ "$string" == [aeiou]* ]]; then
        echo "✅ 以元音字母开头"
    fi
}

# 测试
pattern_matching "hello" "h*"
echo ""
pattern_matching "world" "*d"
echo ""
pattern_matching "shell" "*ll*"

实战应用:文件类型检测

#!/bin/bash

detect_file_type() {
    local filename="$1"
    
    echo "检测文件: $filename"
    
    # 获取文件扩展名
    extension="${filename##*.}"
    basename="${filename%.*}"
    
    echo "文件名: $basename"
    echo "扩展名: $extension"
    
    # 根据扩展名判断文件类型
    case "$extension" in
        "txt"|"md"|"log")
            echo "📄 文本文件"
            ;;
        "jpg"|"jpeg"|"png"|"gif"|"bmp")
            echo "🖼️ 图像文件"
            ;;
        "mp3"|"wav"|"ogg"|"flac")
            echo "🎵 音频文件"
            ;;
        "mp4"|"avi"|"mkv"|"mov")
            echo "🎬 视频文件"
            ;;
        "pdf")
            echo "📕 PDF文件"
            ;;
        "zip"|"tar"|"gz"|"rar"|"7z")
            echo "📦 压缩文件"
            ;;
        "sh"|"bash"|"zsh")
            echo "💻 Shell脚本"
            ;;
        "py")
            echo "🐍 Python脚本"
            ;;
        "java")
            echo "☕ Java源码"
            ;;
        "js"|"ts")
            echo "🌐 JavaScript/TypeScript"
            ;;
        "html"|"htm")
            echo "🌐 HTML文件"
            ;;
        "css")
            echo "🎨 CSS样式表"
            ;;
        "json"|"xml"|"yaml"|"yml")
            echo "📊 配置文件"
            ;;
        *)
            echo "❓ 未知文件类型"
            ;;
    esac
    
    # 检查文件名特征
    if [[ "$basename" == *"test"* || "$basename" == *"Test"* ]]; then
        echo "🧪 可能是测试文件"
    fi
    
    if [[ "$basename" == *"backup"* || "$basename" == *"Backup"* ]]; then
        echo "💾 可能是备份文件"
    fi
    
    if [[ "$basename" == *"temp"* || "$basename" == *"Temp"* ]]; then
        echo "🗑️ 可能是临时文件"
    fi
}

# 测试
detect_file_type "document.txt"
echo ""
detect_file_type "photo.jpg"
echo ""
detect_file_type "script.sh"
echo ""
detect_file_type "test_data.json"

Java 对比示例:

public class StringComparison {
    public static void basicStringComparison(String str1, String str2) {
        System.out.println("比较: '" + str1 + "' 和 '" + str2 + "'");
        // 等于比较
        if (str1.equals(str2)) {
            System.out.println("✅ 字符串相等 (使用 equals)");
        } else {
            System.out.println("❌ 字符串不相等 (使用 equals)");
        }
        // 不等于比较
        if (!str1.equals(str2)) {
            System.out.println("✅ 字符串不相等 (使用 !equals)");
        } else {
            System.out.println("❌ 字符串相等 (使用 !equals)");
        }
        // 忽略大小写的比较
        if (str1.equalsIgnoreCase(str2)) {
            System.out.println("✅ 字符串相等 (忽略大小写)");
        }
        // 长度比较
        if (str1.length() == str2.length()) {
            System.out.println("✅ 长度相等: " + str1.length() + " = " + str2.length());
        } else {
            System.out.println("❌ 长度不相等: " + str1.length() + " ≠ " + str2.length());
        }
        // 字典序比较
        int compareResult = str1.compareTo(str2);
        if (compareResult < 0) {
            System.out.println("✅ '" + str1 + "' 在字典序中小于 '" + str2 + "'");
        } else if (compareResult > 0) {
            System.out.println("✅ '" + str1 + "' 在字典序中大于 '" + str2 + "'");
        } else {
            System.out.println("✅ '" + str1 + "' 在字典序中等于 '" + str2 + "'");
        }
    }
    public static void patternMatching(String string, String pattern) {
        System.out.println("检查: '" + string + "' 是否匹配模式 '" + pattern + "'");
        // 使用正则表达式进行模式匹配
        boolean matches = string.matches(pattern);
        if (matches) {
            System.out.println("✅ 匹配成功");
        } else {
            System.out.println("❌ 匹配失败");
        }
        System.out.println("\n=== 常见模式匹配示例 ===");
        // 以特定字符开头
        if (string.startsWith("h")) {
            System.out.println("✅ 以 'h' 开头");
        }
        // 以特定字符结尾
        if (string.endsWith("d")) {
            System.out.println("✅ 以 'd' 结尾");
        }
        // 包含特定子串
        if (string.contains("ll")) {
            System.out.println("✅ 包含 'll'");
        }
        // 特定长度
        if (string.length() == 4) {
            System.out.println("✅ 长度为4");
        }
        // 使用正则表达式匹配元音字母开头
        if (string.matches("[aeiouAEIOU].*")) {
            System.out.println("✅ 以元音字母开头");
        }
    }
    public static void detectFileType(String filename) {
        System.out.println("检测文件: " + filename);
        // 获取文件扩展名
        int lastDot = filename.lastIndexOf('.');
        String basename = lastDot == -1 ? filename : filename.substring(0, lastDot);
        String extension = lastDot == -1 ? "" : filename.substring(lastDot + 1);
        System.out.println("文件名: " + basename);
        System.out.println("扩展名: " + extension);
        // 根据扩展名判断文件类型
        switch (extension.toLowerCase()) {
            case "txt":
            case "md":
            case "log":
                System.out.println("📄 文本文件");
                break;
            case "jpg":
            case "jpeg":
            case "png":
            case "gif":
            case "bmp":
                System.out.println("🖼️ 图像文件");
                break;
            case "mp3":
            case "wav":
            case "ogg":
            case "flac":
                System.out.println("🎵 音频文件");
                break;
            case "mp4":
            case "avi":
            case "mkv":
            case "mov":
                System.out.println("🎬 视频文件");
                break;
            case "pdf":
                System.out.println("📕 PDF文件");
                break;
            case "zip":
            case "tar":
            case "gz":
            case "rar":
            case "7z":
                System.out.println("📦 压缩文件");
                break;
            case "sh":
            case "bash":
            case "zsh":
                System.out.println("💻 Shell脚本");
                break;
            case "py":
                System.out.println("🐍 Python脚本");
                break;
            case "java":
                System.out.println("☕ Java源码");
                break;
            case "js":
            case "ts":
                System.out.println("🌐 JavaScript/TypeScript");
                break;
            case "html":
            case "htm":
                System.out.println("🌐 HTML文件");
                break;
            case "css":
                System.out.println("🎨 CSS样式表");
                break;
            case "json":
            case "xml":
            case "yaml":
            case "yml":
                System.out.println("📊 配置文件");
                break;
            default:
                System.out.println("❓ 未知文件类型");
                break;
        }
        // 检查文件名特征
        if (basename.toLowerCase().contains("test")) {
            System.out.println("🧪 可能是测试文件");
        }
        if (basename.toLowerCase().contains("backup")) {
            System.out.println("💾 可能是备份文件");
        }
        if (basename.toLowerCase().contains("temp")) {
            System.out.println("🗑️ 可能是临时文件");
        }
    }
    public static void main(String[] args) {
        // 测试基础字符串比较
        basicStringComparison("hello", "hello");
        System.out.println();
        basicStringComparison("hello", "world");
        System.out.println();
        basicStringComparison("apple", "banana");
        // 测试模式匹配
        System.out.println();
        patternMatching("hello", "h.*");  // Java使用正则表达式
        System.out.println();
        patternMatching("world", ".*d");
        System.out.println();
        patternMatching("shell", ".*ll.*");
        // 测试文件类型检测
        System.out.println();
        detectFileType("document.txt");
        System.out.println();
        detectFileType("photo.jpg");
        System.out.println();
        detectFileType("script.sh");
        System.out.println();
        detectFileType("test_data.json");
    }
}

字符串格式化与对齐

良好的格式化可以让输出更加美观和易读。

使用 printf 格式化

#!/bin/bash

format_with_printf() {
    echo "=== printf 格式化示例 ==="
    
    # 基本字符串格式化
    printf "姓名: %-15s 年龄: %3d\n" "张三" 25
    printf "姓名: %-15s 年龄: %3d\n" "李四" 30
    printf "姓名: %-15s 年龄: %3d\n" "王五" 28
    
    echo ""
    
    # 数字格式化
    printf "整数: %d\n" 123
    printf "浮点数: %.2f\n" 3.14159
    printf "货币: $%.2f\n" 1234.567
    printf "百分比: %.1f%%\n" 75.5
    
    echo ""
    
    # 对齐和填充
    printf "左对齐: '%-10s'\n" "Hello"
    printf "右对齐: '%10s'\n" "Hello"
    printf "零填充: '%08d'\n" 123
    printf "空格填充: '%8d'\n" 123
    
    echo ""
    
    # 表格格式化
    printf "%-20s %-10s %-10s\n" "产品名称" "价格" "库存"
    printf "%-20s %-10s %-10s\n" "------------------" "------" "------"
    printf "%-20s $%-9.2f %-10d\n" "笔记本电脑" 899.99 15
    printf "%-20s $%-9.2f %-10d\n" "鼠标" 29.99 50
    printf "%-20s $%-9.2f %-10d\n" "键盘" 79.99 25
}

format_with_printf

使用 column 命令创建表格

#!/bin/bash

create_table_with_column() {
    echo "=== 使用 column 命令创建表格 ==="
    
    # 创建制表符分隔的数据
    cat > temp_data.txt << EOF
姓名	年龄	城市	职业
张三	25	北京	工程师
李四	30	上海	设计师
王五	28	广州	产品经理
赵六	35	深圳	架构师
EOF

    # 使用column命令格式化
    echo "基本表格:"
    column -t -s $'\t' temp_data.txt
    
    echo ""
    echo "带分隔线的表格:"
    {
        head -n 1 temp_data.txt
        echo "----	----	----	----"
        tail -n +2 temp_data.txt
    } | column -t -s $'\t'
    
    # 清理临时文件
    rm temp_data.txt
}

create_table_with_column

自定义对齐函数

#!/bin/bash

# 左对齐函数
left_align() {
    local text="$1"
    local width="$2"
    local fill="${3:- }"  # 默认填充字符为空格
    
    local text_length=${#text}
    if [ $text_length -ge $width ]; then
        echo "$text"
    else
        local padding_length=$((width - text_length))
        local padding=$(printf "%${padding_length}s" | tr ' ' "$fill")
        echo -n "$text"
        echo "$padding"
    fi
}

# 右对齐函数
right_align() {
    local text="$1"
    local width="$2"
    local fill="${3:- }"  # 默认填充字符为空格
    
    local text_length=${#text}
    if [ $text_length -ge $width ]; then
        echo "$text"
    else
        local padding_length=$((width - text_length))
        local padding=$(printf "%${padding_length}s" | tr ' ' "$fill")
        echo -n "$padding"
        echo "$text"
    fi
}

# 居中对齐函数
center_align() {
    local text="$1"
    local width="$2"
    local fill="${3:- }"  # 默认填充字符为空格
    
    local text_length=${#text}
    if [ $text_length -ge $width ]; then
        echo "$text"
    else
        local total_padding=$((width - text_length))
        local left_padding=$((total_padding / 2))
        local right_padding=$((total_padding - left_padding))
        
        local left_pad=$(printf "%${left_padding}s" | tr ' ' "$fill")
        local right_pad=$(printf "%${right_padding}s" | tr ' ' "$fill")
        
        echo -n "$left_pad"
        echo -n "$text"
        echo "$right_pad"
    fi
}

# 测试对齐函数
echo "=== 自定义对齐函数测试 ==="
echo "原始文本: 'Hello World'"

echo "左对齐 (宽度20): '$(left_align "Hello World" 20)'"
echo "右对齐 (宽度20): '$(right_align "Hello World" 20)'"
echo "居中对齐 (宽度20): '$(center_align "Hello World" 20)'"

echo "左对齐 (宽度20, 填充*): '$(left_align "Hello World" 20 "*")'"
echo "右对齐 (宽度20, 填充*): '$(right_align "Hello World" 20 "*")'"
echo "居中对齐 (宽度20, 填充*): '$(center_align "Hello World" 20 "*")'"

# 创建美观的菜单
create_beautiful_menu() {
    echo ""
    echo "$(center_align "=== 系统管理菜单 ===" 40 "=")"
    echo "$(left_align "1. 用户管理" 30)$(right_align "按 1 选择" 10)"
    echo "$(left_align "2. 文件管理" 30)$(right_align "按 2 选择" 10)"
    echo "$(left_align "3. 网络配置" 30)$(right_align "按 3 选择" 10)"
    echo "$(left_align "4. 系统监控" 30)$(right_align "按 4 选择" 10)"
    echo "$(left_align "5. 退出系统" 30)$(right_align "按 5 选择" 10)"
    echo "$(center_align "" 40 "=")"
}

create_beautiful_menu

Java 对比示例:

public class StringFormatting {
    public static void formatWithPrintf() {
        System.out.println("=== printf 格式化示例 ===");
        // 基本字符串格式化
        System.out.printf("姓名: %-15s 年龄: %3d%n", "张三", 25);
        System.out.printf("姓名: %-15s 年龄: %3d%n", "李四", 30);
        System.out.printf("姓名: %-15s 年龄: %3d%n", "王五", 28);
        System.out.println();
        // 数字格式化
        System.out.printf("整数: %d%n", 123);
        System.out.printf("浮点数: %.2f%n", 3.14159);
        System.out.printf("货币: $%.2f%n", 1234.567);
        System.out.printf("百分比: %.1f%%%n", 75.5);
        System.out.println();
        // 对齐和填充
        System.out.printf("左对齐: '%-10s'%n", "Hello");
        System.out.printf("右对齐: '%10s'%n", "Hello");
        System.out.printf("零填充: '%08d'%n", 123);
        System.out.printf("空格填充: '%8d'%n", 123);
        System.out.println();
        // 表格格式化
        System.out.printf("%-20s %-10s %-10s%n", "产品名称", "价格", "库存");
        System.out.printf("%-20s %-10s %-10s%n", "------------------", "------", "------");
        System.out.printf("%-20s $%-9.2f %-10d%n", "笔记本电脑", 899.99, 15);
        System.out.printf("%-20s $%-9.2f %-10d%n", "鼠标", 29.99, 50);
        System.out.printf("%-20s $%-9.2f %-10d%n", "键盘", 79.99, 25);
    }
    // 左对齐函数
    public static String leftAlign(String text, int width, char fill) {
        if (text.length() >= width) {
            return text;
        }
        int paddingLength = width - text.length();
        StringBuilder padding = new StringBuilder();
        for (int i = 0; i < paddingLength; i++) {
            padding.append(fill);
        }
        return text + padding.toString();
    }
    // 右对齐函数
    public static String rightAlign(String text, int width, char fill) {
        if (text.length() >= width) {
            return text;
        }
        int paddingLength = width - text.length();
        StringBuilder padding = new StringBuilder();
        for (int i = 0; i < paddingLength; i++) {
            padding.append(fill);
        }
        return padding.toString() + text;
    }
    // 居中对齐函数
    public static String centerAlign(String text, int width, char fill) {
        if (text.length() >= width) {
            return text;
        }
        int totalPadding = width - text.length();
        int leftPadding = totalPadding / 2;
        int rightPadding = totalPadding - leftPadding;
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < leftPadding; i++) {
            result.append(fill);
        }
        result.append(text);
        for (int i = 0; i < rightPadding; i++) {
            result.append(fill);
        }
        return result.toString();
    }
    public static void createBeautifulMenu() {
        System.out.println();
        System.out.println(centerAlign("=== 系统管理菜单 ===", 40, '='));
        System.out.println(leftAlign("1. 用户管理", 30, ' ') + rightAlign("按 1 选择", 10, ' '));
        System.out.println(leftAlign("2. 文件管理", 30, ' ') + rightAlign("按 2 选择", 10, ' '));
        System.out.println(leftAlign("3. 网络配置", 30, ' ') + rightAlign("按 3 选择", 10, ' '));
        System.out.println(leftAlign("4. 系统监控", 30, ' ') + rightAlign("按 4 选择", 10, ' '));
        System.out.println(leftAlign("5. 退出系统", 30, ' ') + rightAlign("按 5 选择", 10, ' '));
        System.out.println(centerAlign("", 40, '='));
    }
    public static void main(String[] args) {
        // 测试printf格式化
        formatWithPrintf();
        // 测试对齐函数
        System.out.println("\n=== 自定义对齐函数测试 ===");
        System.out.println("原始文本: 'Hello World'");
        System.out.println("左对齐 (宽度20): '" + leftAlign("Hello World", 20, ' ') + "'");
        System.out.println("右对齐 (宽度20): '" + rightAlign("Hello World", 20, ' ') + "'");
        System.out.println("居中对齐 (宽度20): '" + centerAlign("Hello World", 20, ' ') + "'");
        System.out.println("左对齐 (宽度20, 填充*): '" + leftAlign("Hello World", 20, '*') + "'");
        System.out.println("右对齐 (宽度20, 填充*): '" + rightAlign("Hello World", 20, '*') + "'");
        System.out.println("居中对齐 (宽度20, 填充*): '" + centerAlign("Hello World", 20, '*') + "'");
        // 创建美观的菜单
        createBeautifulMenu();
    }
}

高级字符串处理技巧

现在让我们探索一些更高级的字符串处理技巧,这些技巧可以解决复杂的实际问题。

多行字符串处理

#!/bin/bash

process_multiline_string() {
    # 创建多行字符串
    local multiline_text="第一行文本
第二行包含特殊字符 !@#$%^&*
第三行数字 123456
第四行混合内容 abc123XYZ
第五行结束文本"
    
    echo "=== 多行字符串处理 ==="
    echo "原始文本:"
    echo "$multiline_text"
    echo ""
    
    # 按行分割处理
    echo "按行处理:"
    line_number=1
    while IFS= read -r line; do
        echo "第${line_number}行: '$line'"
        ((line_number++))
    done <<< "$multiline_text"
    
    echo ""
    
    # 统计行数
    line_count=$(echo "$multiline_text" | wc -l)
    echo "总行数: $line_count"
    
    # 提取特定行
    second_line=$(echo "$multiline_text" | sed -n '2p')
    echo "第二行: $second_line"
    
    # 处理每行(去除空格、转换大小写等)
    echo ""
    echo "处理后的文本:"
    echo "$multiline_text" | while IFS= read -r line; do
        # 去除首尾空格
        trimmed=$(echo "$line" | xargs)
        # 转换为大写
        upper=$(echo "$trimmed" | tr '[:lower:]' '[:upper:]')
        echo "$upper"
    done
}

process_multiline_string

字符串编码转换

#!/bin/bash

handle_encoding() {
    echo "=== 字符串编码处理 ==="
    
    # 创建包含中文的测试文本
    local chinese_text="你好,世界!Hello, World!"
    echo "原始文本: $chinese_text"
    
    # 转换为UTF-8(通常默认就是UTF-8)
    utf8_text=$(echo "$chinese_text" | iconv -f UTF-8 -t UTF-8 2>/dev/null || echo "$chinese_text")
    echo "UTF-8编码: $utf8_text"
    
    # 如果系统支持,可以尝试其他编码转换
    if command -v iconv >/dev/null 2>&1; then
        echo "支持iconv命令,可以进行编码转换"
        
        # 显示可用编码
        echo "部分可用编码:"
        iconv -l | head -5
        
        # 尝试GB2312编码(如果支持)
        if echo "$chinese_text" | iconv -f UTF-8 -t GB2312 >/dev/null 2>&1; then
            gb2312_text=$(echo "$chinese_text" | iconv -f UTF-8 -t GB2312)
            echo "GB2312编码转换成功"
        else
            echo "GB2312编码转换不支持"
        fi
    else
        echo "系统不支持iconv命令"
    fi
    
    # URL编码
    url_encode() {
        local string="${1}"
        local strlen=${#string}
        local encoded=""
        local pos c o
        
        for (( pos=0 ; pos<strlen ; pos++ )); do
            c=${string:$pos:1}
            case "$c" in
                [-_.~a-zA-Z0-9] ) o="${c}" ;;
                * )               printf -v o '%%%02x' "'$c"
            esac
            encoded+="${o}"
        done
        echo "${encoded}"
    }
    
    # URL解码
    url_decode() {
        local url_encoded="${1//+/ }"
        printf '%b' "${url_encoded//%/\\x}"
    }
    
    # 测试URL编码
    url_encoded=$(url_encode "$chinese_text")
    echo "URL编码: $url_encoded"
    
    url_decoded=$(url_decode "$url_encoded")
    echo "URL解码: $url_decoded"
}

handle_encoding

字符串模板引擎

#!/bin/bash

# 简单的模板引擎实现
template_engine() {
    local template="$1"
    shift
    local params=("$@")
    
    echo "=== 模板引擎 ==="
    echo "模板: $template"
    echo "参数: ${params[*]}"
    echo ""
    
    # 替换 {0}, {1}, {2}... 格式的占位符
    result="$template"
    for i in "${!params[@]}"; do
        placeholder="{${i}}"
        result=${result//${placeholder}/${params[i]}}
    done
    
    # 替换 {name} 格式的命名占位符
    for param in "${params[@]}"; do
        if [[ "$param" == *"="* ]]; then
            name="${param%%=*}"
            value="${param#*=}"
            placeholder="{${name}}"
            result=${result//${placeholder}/${value}}
        fi
    done
    
    echo "结果: $result"
    return 0
}

# 测试模板引擎
template_engine "Hello {0}, welcome to {1}!" "John" "Shell Scripting"
echo ""

template_engine "用户 {name} 在 {time} 登录系统,IP地址: {ip}" \
    "name=张三" "time=2024-01-15 14:30:25" "ip=192.168.1.100"
echo ""

template_engine "订单号: {order_id}, 商品: {product}, 数量: {quantity}, 总价: ¥{price}" \
    "order_id=ORD2024001" "product=笔记本电脑" "quantity=1" "price=8999.00"

字符串性能优化技巧

#!/bin/bash

# 性能测试函数
performance_test() {
    local iterations=10000
    echo "=== 字符串操作性能测试 ($iterations 次迭代) ==="
    
    # 测试字符串连接性能
    echo "1. 字符串连接性能测试:"
    
    # 方法1:使用 += 操作符
    start_time=$(date +%s%N)
    result1=""
    for ((i=0; i<iterations; i++)); do
        result1+="a"
    done
    end_time=$(date +%s%N)
    time1=$((end_time - start_time))
    echo "   += 操作符: ${time1:0:-6}ms (长度: ${#result1})"
    
    # 方法2:使用数组然后join
    start_time=$(date +%s%N)
    declare -a arr2
    for ((i=0; i<iterations; i++)); do
        arr2+=("a")
    done
    result2=$(IFS=; echo "${arr2[*]}")
    end_time=$(date +%s%N)
    time2=$((end_time - start_time))
    echo "   数组join: ${time2:0:-6}ms (长度: ${#result2})"
    
    # 测试字符串替换性能
    echo ""
    echo "2. 字符串替换性能测试:"
    test_string=$(printf '%*s' $((iterations/10)) | tr ' ' 'xabcdefghij')
    
    # 方法1:使用 ${var//pattern/replacement}
    start_time=$(date +%s%N)
    result3=${test_string//a/A}
    end_time=$(date +%s%N)
    time3=$((end_time - start_time))
    echo "   ${} 语法: ${time3:0:-6}ms"
    
    # 方法2:使用 sed
    start_time=$(date +%s%N)
    result4=$(echo "$test_string" | sed 's/a/A/g')
    end_time=$(date +%s%N)
    time4=$((end_time - start_time))
    echo "   sed命令: ${time4:0:-6}ms"
    
    # 测试字符串分割性能
    echo ""
    echo "3. 字符串分割性能测试:"
    delimiter_string=$(printf 'a,b,c,d,e,f,g,h,i,j,'%.0s {1..1000})
    
    # 方法1:使用 IFS
    start_time=$(date +%s%N)
    OLD_IFS=$IFS
    IFS=','
    read -ra arr5 <<< "$delimiter_string"
    IFS=$OLD_IFS
    end_time=$(date +%s%N)
    time5=$((end_time - start_time))
    echo "   IFS方法: ${time5:0:-6}ms (数组长度: ${#arr5[@]})"
    
    # 方法2:使用 cut
    start_time=$(date +%s%N)
    arr6=()
    i=1
    while [ $i -le 10000 ]; do
        element=$(echo "$delimiter_string" | cut -d',' -f$i)
        [ -z "$element" ] && break
        arr6+=("$element")
        ((i++))
    done
    end_time=$(date +%s%N)
    time6=$((end_time - start_time))
    echo "   cut命令: ${time6:0:-6}ms (数组长度: ${#arr6[@]})"
}

# 运行性能测试
performance_test

渲染错误: Mermaid 渲染失败: Parse error on line 8: ...组join] D --> H[${} 语法] D --> I[s ----------------------^ Expecting 'SQE', 'DOUBLECIRCLEEND', 'PE', '-)', 'STADIUMEND', 'SUBROUTINEEND', 'PIPE', 'CYLINDEREND', 'DIAMOND_STOP', 'TAGEND', 'TRAPEND', 'INVTRAPEND', 'UNICODE_TEXT', 'TEXT', 'TAGSTART', got 'DIAMOND_START'

Java 对比示例:

import java.net.URLEncoder;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
public class AdvancedStringHandling {
    public static void processMultilineString() {
        // 创建多行字符串
        String multilineText = "第一行文本\n" +
                              "第二行包含特殊字符 !@#$%^&*\n" +
                              "第三行数字 123456\n" +
                              "第四行混合内容 abc123XYZ\n" +
                              "第五行结束文本";
        System.out.println("=== 多行字符串处理 ===");
        System.out.println("原始文本:");
        System.out.println(multilineText);
        System.out.println();
        // 按行分割处理
        System.out.println("按行处理:");
        String[] lines = multilineText.split("\n");
        for (int i = 0; i < lines.length; i++) {
            System.out.println("第" + (i + 1) + "行: '" + lines[i] + "'");
        }
        System.out.println();
        // 统计行数
        System.out.println("总行数: " + lines.length);
        // 提取特定行
        if (lines.length >= 2) {
            System.out.println("第二行: " + lines[1]);
        }
        // 处理每行(去除空格、转换大小写等)
        System.out.println();
        System.out.println("处理后的文本:");
        for (String line : lines) {
            // 去除首尾空格
            String trimmed = line.trim();
            // 转换为大写
            String upper = trimmed.toUpperCase();
            System.out.println(upper);
        }
    }
    public static void handleEncoding() throws Exception {
        System.out.println("=== 字符串编码处理 ===");
        // 创建包含中文的测试文本
        String chineseText = "你好,世界!Hello, World!";
        System.out.println("原始文本: " + chineseText);
        // UTF-8编码(Java默认)
        byte[] utf8Bytes = chineseText.getBytes(StandardCharsets.UTF_8);
        String utf8Text = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println("UTF-8编码: " + utf8Text);
        // URL编码
        String urlEncoded = URLEncoder.encode(chineseText, StandardCharsets.UTF_8);
        System.out.println("URL编码: " + urlEncoded);
        // URL解码
        String urlDecoded = URLDecoder.decode(urlEncoded, StandardCharsets.UTF_8);
        System.out.println("URL解码: " + urlDecoded);
    }
    // 简单的模板引擎实现
    public static class TemplateEngine {
        public static String process(String template, Object... params) {
            System.out.println("=== 模板引擎 ===");
            System.out.println("模板: " + template);
            System.out.println("参数: " + Arrays.toString(params));
            System.out.println();
            String result = template;
            // 替换 {0}, {1}, {2}... 格式的占位符
            for (int i = 0; i < params.length; i++) {
                if (params[i] instanceof String && ((String)params[i]).contains("=")) {
                    continue; // 跳过命名参数
                }
                String placeholder = "{" + i + "}";
                result = result.replace(placeholder, String.valueOf(params[i]));
            }
            // 替换 {name} 格式的命名占位符
            Map<String, String> namedParams = new HashMap<>();
            for (Object param : params) {
                if (param instanceof String && ((String)param).contains("=")) {
                    String[] parts = ((String)param).split("=", 2);
                    namedParams.put(parts[0], parts[1]);
                }
            }
            for (Map.Entry<String, String> entry : namedParams.entrySet()) {
                String placeholder = "{" + entry.getKey() + "}";
                result = result.replace(placeholder, entry.getValue());
            }
            System.out.println("结果: " + result);
            return result;
        }
    }
    public static void performanceTest() {
        int iterations = 10000;
        System.out.println("=== 字符串操作性能测试 (" + iterations + " 次迭代) ===");
        // 测试字符串连接性能
        System.out.println("1. 字符串连接性能测试:");
        // 方法1:使用 += 操作符
        long startTime = System.nanoTime();
        String result1 = "";
        for (int i = 0; i < iterations; i++) {
            result1 += "a";
        }
        long endTime = System.nanoTime();
        long time1 = (endTime - startTime) / 1000000;
        System.out.println("   += 操作符: " + time1 + "ms (长度: " + result1.length() + ")");
        // 方法2:使用 StringBuilder
        startTime = System.nanoTime();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iterations; i++) {
            sb.append("a");
        }
        String result2 = sb.toString();
        endTime = System.nanoTime();
        long time2 = (endTime - startTime) / 1000000;
        System.out.println("   StringBuilder: " + time2 + "ms (长度: " + result2.length() + ")");
        // 测试字符串替换性能
        System.out.println();
        System.out.println("2. 字符串替换性能测试:");
        char[] testChars = new char[iterations / 10];
        Arrays.fill(testChars, 'x');
        for (int i = 0; i < testChars.length; i++) {
            testChars[i] = "abcdefghij".charAt(i % 10);
        }
        String testString = new String(testChars);
        // 方法1:使用 replaceAll
        startTime = System.nanoTime();
        String result3 = testString.replaceAll("a", "A");
        endTime = System.nanoTime();
        long time3 = (endTime - startTime) / 1000000;
        System.out.println("   replaceAll: " + time3 + "ms");
        // 方法2:使用 StringBuilder 手动替换
        startTime = System.nanoTime();
        StringBuilder sb2 = new StringBuilder();
        for (char c : testString.toCharArray()) {
            sb2.append(c == 'a' ? 'A' : c);
        }
        String result4 = sb2.toString();
        endTime = System.nanoTime();
        long time4 = (endTime - startTime) / 1000000;
        System.out.println("   StringBuilder手动: " + time4 + "ms");
        // 测试字符串分割性能
        System.out.println();
        System.out.println("3. 字符串分割性能测试:");
        StringBuilder delimiterBuilder = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            delimiterBuilder.append("a,b,c,d,e,f,g,h,i,j,");
        }
        String delimiterString = delimiterBuilder.toString();
        // 方法1:使用 split
        startTime = System.nanoTime();
        String[] arr5 = delimiterString.split(",");
        endTime = System.nanoTime();
        long time5 = (endTime - startTime) / 1000000;
        System.out.println("   split方法: " + time5 + "ms (数组长度: " + arr5.length + ")");
        // 方法2:使用 StringTokenizer(传统方法)
        startTime = System.nanoTime();
        java.util.StringTokenizer tokenizer = new java.util.StringTokenizer(delimiterString, ",");
        String[] arr6 = new String[tokenizer.countTokens()];
        int index = 0;
        while (tokenizer.hasMoreTokens()) {
            arr6[index++] = tokenizer.nextToken();
        }
        endTime = System.nanoTime();
        long time6 = (endTime - startTime) / 1000000;
        System.out.println("   StringTokenizer: " + time6 + "ms (数组长度: " + arr6.length + ")");
    }
    public static void main(String[] args) throws Exception {
        // 测试多行字符串处理
        processMultilineString();
        System.out.println();
        // 测试编码处理
        handleEncoding();
        System.out.println();
        // 测试模板引擎
        TemplateEngine.process("Hello {0}, welcome to {1}!", "John", "Java Programming");
        System.out.println();
        TemplateEngine.process("用户 {name} 在 {time} 登录系统,IP地址: {ip}",
            "name=张三", "time=2024-01-15 14:30:25", "ip=192.168.1.100");
        System.out.println();
        TemplateEngine.process("订单号: {order_id}, 商品: {product}, 数量: {quantity}, 总价: ¥{price}",
            "order_id=ORD2024001", "product=笔记本电脑", "quantity=1", "price=8999.00");
        System.out.println();
        // 运行性能测试
        performanceTest();

以上就是Shell脚本中字符串处理的实用技巧分享的详细内容,更多关于Shell脚本字符串处理技巧的资料请关注脚本之家其它相关文章!

您可能感兴趣的文章:
阅读全文