As the title says, some downloaded text files come in an odd encoding. Here is what the file command reports:
file systeminfo.txt
Non-ISO extended-ASCII text, with very long lines, with CRLF line terminators
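If the author saved the file in some unusual encoding, you have no way of knowing which one. A small script can find it by trying every encoding iconv supports: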
cat code.sh
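#!/bin/bash
# Usage: ./code.sh FILE
# Try every encoding iconv knows as the source encoding of FILE
# and record whether the conversion to UTF-8 succeeds.
iconv --list | sed 's/\/\/$//' | sort > encodings.list
for a in $(cat encodings.list); do
    # Quote "$a" and "$1" so odd names or paths with spaces do not break the call.
    if iconv -f "$a" -t UTF-8 "$1" > /dev/null 2>&1; then
        echo "ok: $a"
    else
        echo "fail: $a"
    fi
done | tee result.txt
# Show only the GB family of encodings.
grep GB result.txt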
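Run it: ./code.sh systeminfo.txt
Then look through result.txt. An "ok" only means iconv decoded the bytes without an error, and many single-byte encodings accept any input, so a lot of entries will report ok. For a Chinese-language file, the entries starting with GB are the only ones worth attention.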
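To narrow result.txt down to the successful GB candidates, something like the following could work. This is only a sketch: list_ok_candidates is a made-up helper name, and it assumes each successful line in result.txt contains "ok: " followed by the encoding name, as the script above writes it.

```shell
# list_ok_candidates RESULT_FILE PREFIX
# Hypothetical helper: prints the encoding names that were marked "ok"
# and whose name starts with PREFIX (for example GB).
list_ok_candidates() {
    result_file=$1
    prefix=$2
    # Lines marked "fail: ..." do not contain "ok: ", so grep skips them.
    # sed then strips everything up to and including "ok: ", leaving the name.
    grep "ok: $prefix" "$result_file" | sed 's/^.*ok: //'
}
```

Usage: list_ok_candidates result.txt GB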
Run the conversion:
iconv -f GB18030 -t UTF-8 systeminfo.txt > 2222.txt
# file 2222.txt
2222.txt: UTF-8 Unicode text, with very long lines, with CRLF line terminators
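One caveat with redirecting iconv straight into the target file: if the conversion fails halfway, a truncated output file is left behind. A possible way around that is to convert into a temporary file first and move it into place only on success. The sketch below assumes iconv exits non-zero on undecodable input; convert_to_utf8 is a hypothetical helper name, not part of iconv.

```shell
# convert_to_utf8 SRC_ENCODING INPUT OUTPUT
# Hypothetical helper: converts INPUT to UTF-8 and creates OUTPUT
# only if iconv finishes without an error.
convert_to_utf8() {
    src_encoding=$1
    input=$2
    output=$3
    tmp=$(mktemp) || return 1
    if iconv -f "$src_encoding" -t UTF-8 "$input" > "$tmp"; then
        # Conversion succeeded: move the finished file into place.
        mv "$tmp" "$output"
    else
        # Conversion failed: discard the partial result.
        rm -f "$tmp"
        return 1
    fi
}
```

Usage: convert_to_utf8 GB18030 systeminfo.txt 2222.txt
Note that mktemp creates the file readable only by its owner, so the output may need a chmod afterwards.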