{"id":11102,"date":"2021-12-10T23:33:14","date_gmt":"2021-12-10T20:33:14","guid":{"rendered":"https:\/\/kifarunix.com\/?p=11102"},"modified":"2024-03-18T07:46:18","modified_gmt":"2024-03-18T04:46:18","slug":"how-to-get-character-encoding-of-a-file-in-linux","status":"publish","type":"post","link":"https:\/\/kifarunix.com\/how-to-get-character-encoding-of-a-file-in-linux\/","title":{"rendered":"How to get character encoding of a file in Linux"},"content":{"rendered":"\n<p>Are you trying to get the character encoding of a file in Linux? Follow this guide to learn some simple ways to find the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Character_encoding\" target=\"_blank\" rel=\"noreferrer noopener\">character encoding<\/a> of a file in Linux.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Getting character encoding of a file in Linux<\/h2>\n\n\n\n<p>In Linux, there are a number of commands that you can use to get the character encoding of a file.<\/p>\n\n\n\n<p>Such commands include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>file<\/li>\n\n\n\n<li>encguess<\/li>\n\n\n\n<li>NPM dfeal<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Get character encoding of a file using <strong><code>file<\/code><\/strong> command in Linux<\/h3>\n\n\n\n<p><strong><code>file<\/code><\/strong> is a command in Linux that is used to determine file types. 
It can also be used to determine the character encoding of files.<\/p>\n\n\n\n<p>Assuming you have a file, <strong><code>file.txt<\/code><\/strong>, and you want to get its character encoding, run the command below;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>file file.txt<\/code><\/pre>\n\n\n\n<p>Sample output;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>file.txt: <strong>UTF-8<\/strong> Unicode text<\/code><\/pre>\n\n\n\n<p>From the output, the character encoding of file.txt is <code><strong>UTF-8<\/strong><\/code>.<\/p>\n\n\n\n<p>You can also pass option <strong><code>-i\/--mime<\/code><\/strong> to print MIME type strings such as <strong><code>text\/plain; charset=us-ascii<\/code><\/strong> rather than <strong><code>ASCII text<\/code><\/strong>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>file -i file.txt<\/code><\/pre>\n\n\n\n<p>Sample output;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>file.txt: text\/plain; charset=<strong>utf-8<\/strong><\/code><\/pre>\n\n\n\n<p>If you want to omit filenames from the command output, use option <code><strong>-b\/--brief<\/strong><\/code>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>file -ib file.txt<\/code><\/pre>\n\n\n\n<p>Sample output;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>text\/plain; charset=<strong>utf-8<\/strong><\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Get character encoding of a file using <strong><code>encguess<\/code><\/strong> command in Linux<\/h3>\n\n\n\n<p><code>encguess<\/code> is a command provided by the perl (Debian\/Ubuntu) or perl\/perl-Encode (RHEL based) package that can be used to guess character encodings of files.<\/p>\n\n\n\n<p>The command line syntax;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>encguess &#91;options] filename<\/code><\/pre>\n\n\n\n<p>Using the example file above, file.txt;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>encguess file.txt<\/code><\/pre>\n\n\n\n<p>Sample output;<\/p>\n\n\n\n<pre 
class=\"wp-block-code\"><code>file.txt\tUTF-8<\/code><\/pre>\n\n\n\n<p>Read more in the man page, <strong><code>man encguess<\/code><\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Get character encoding of a file using <code><strong>dfeal<\/strong><\/code> command in Linux<\/h3>\n\n\n\n<p><strong><code>dfeal (detect-file-encoding-and-language)<\/code><\/strong>&nbsp;<em>is an NPM command that is used to determine the encoding and language of text files.<\/em><\/p>\n\n\n\n<p>To install <strong><code>detect-file-encoding-and-language<\/code><\/strong>, you first need to install NPM;<\/p>\n\n\n\n<p>Ubuntu\/Debian;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo apt install nodejs npm -y<\/code><\/pre>\n\n\n\n<p>For RHEL based distros, <a href=\"https:\/\/nodejs.org\/en\/download\/package-manager\/#centos-fedora-and-red-hat-enterprise-linux\" target=\"_blank\" rel=\"noreferrer noopener\">see how to install NPM<\/a>.<\/p>\n\n\n\n<p>Next, install the dfeal command;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo npm install -g detect-file-encoding-and-language<\/code><\/pre>\n\n\n\n<p>Getting the character encoding;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>dfeal file.txt<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n    \"encoding\": \"<strong>UTF-8<\/strong>\",\n    \"language\": \"spanish\",\n    \"confidence\": {\n        \"encoding\": 1,\n        \"language\": 0.02\n    }\n}<\/code><\/pre>\n\n\n\n<p>There could be more commands to get the character encoding of a file in Linux. 
Leave them in the comment section.<\/p>\n\n\n\n<p>That marks the end of our guide on how to get the character encoding of a file in Linux.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Other Tutorials<\/h3>\n\n\n\n<p><a href=\"https:\/\/kifarunix.com\/install-cheat-command-on-ubuntu-20-04\/\" target=\"_blank\" rel=\"noreferrer noopener\">Install Cheat Command on Ubuntu 20.04<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/kifarunix.com\/example-usage-of-ps-command-in-linux\/\" target=\"_blank\" rel=\"noreferrer noopener\">Example Usage of ps Command in Linux<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Are you trying to get character encoding of a file in Linux? Well, follow through this guide to learn some simple ways that you can<\/p>\n","protected":false},"author":1,"featured_media":11113,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_lock_modified_date":false,"footnotes":""},"categories":[121],"tags":[4341,4344,4350,4351,4349,4342,4343,4352],"class_list":["post-11102","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-howtos","tag-auto-detect-text-file-encoding","tag-character-encoding-of-a-file","tag-dfeal-command","tag-encguess","tag-file-command-character-encoding","tag-find-file-character-encoding-in-linux","tag-linux-get-file-character-encoding","tag-uchardet","generate-columns","tablet-grid-50","mobile-grid-100","grid-parent","grid-50","resize-featured-image"],"_links":{"self":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/11102"}],"collection":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/comments?post=11102"}],"version-history":[{"count":6,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/
v2\/posts\/11102\/revisions"}],"predecessor-version":[{"id":21588,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/11102\/revisions\/21588"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media\/11113"}],"wp:attachment":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media?parent=11102"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/categories?post=11102"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/tags?post=11102"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}