Lurch

Reputation: 875

Remove ANSI color codes from a text file using bash

I have a bash script that writes its output to a text file, but the colour codes it uses end up in the file as well. How can I remove them from the file? For example:

^[[38;1;32mHello^[[39m
^[[38;1;31mUser^[[39m

I just want to be left with Hello and User.

Upvotes: 27

Views: 17198

Answers (5)

Andrés

Reputation: 66

TL;DR

Here is a more general answer to remove any valid CSI (Control Sequence Introducer) sequence:

LC_ALL=C.UTF8 sed -E "s/\x1B\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]//g"

Explanation

The sed regex \x1B\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E] matches the Wikipedia description of a CSI sequence:

  • \x1B\[: starts with ESC [
  • [\x30-\x3F]*: any number (including none) of "parameter bytes" in the range 0x30–0x3F (ASCII 0–9:;<=>?)
  • [\x20-\x2F]*: any number of "intermediate bytes" in the range 0x20–0x2F (ASCII space and !"#$%&'()*+,-./)
  • [\x40-\x7E]: ends with a single "final byte" in the range 0x40–0x7E (ASCII @A–Z[\]^_`a–z{|}~)

Character ranges in a bracket expression depend on the locale's collation order. For example, with en_US.UTF8 you get the following error:

sed: -e expression #1, char 47: Invalid range end

To avoid this, add LC_ALL=C.UTF8 at the beginning of the command. This changes the value of LC_ALL only for the sed command; subsequent commands are unaffected.
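
The byte ranges listed above can be tried out with a small sketch like the following. It is not the command from this answer: the helper name strip_csi is made up for illustration, the range endpoints are written as literal characters instead of \xHH escapes (so it does not rely on GNU sed escape handling), and plain LC_ALL=C is assumed to be an acceptable stand-in for C.UTF8.

```shell
# strip_csi: hypothetical helper; reads stdin, writes stdout.
strip_csi() {
    # A real ESC byte (0x1B), produced with printf instead of \x1B.
    esc=$(printf '\033')
    # Parameter bytes 0x30-0x3F run from '0' to '?', intermediate bytes
    # 0x20-0x2F from ' ' to '/', and final bytes 0x40-0x7E from '@' to '~'.
    # '|' is used as the s-command delimiter because '/' appears in a bracket.
    LC_ALL=C sed "s|${esc}\[[0-?]*[ -/]*[@-~]||g"
}

# Sample input modelled on the question, with real ESC bytes.
printf '\033[38;1;32mHello\033[39m\n\033[38;1;31mUser\033[39m\n' | strip_csi
```

With the sample input this should leave only the words Hello and User, one per line.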

Upvotes: 0

CypherX

Reputation: 7353

Solution

The most upvoted answer did not work for me out of the box. It needed a small tweak.

HowTo:

  • Run the following in a bash shell, or add the code block to your existing list of aliases so that the decolor alias can be reused later.
## Decolor ANSI Colored Output
# example: (see preview in VSCode editor)
# >>> cat <filepath> | decolor | code -
alias decolor.styles='sed -E "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})*)?[mKHfJ]//gm"'
alias decolor.reset='sed -E "s/\x1B\([A-Z]{1}(\x1B\[[mKHfJ])?//gm"'
alias decolor='decolor.styles | decolor.reset'

Usage:

cat coloredtext.txt | decolor

Output:

PRESENT: /somepath/somefile_a.csv
PRESENT: /somepath/somefile_b.csv

Dummy Data

# File Name: coloredtext.txt
# [1m[32mPRESENT:(B[m [32m/somepath/somefile_a.csv(B[m
# [1m[32mPRESENT:(B[m [32m/somepath/somefile_b.csv(B[m
\x1B[1m\x1B[32mPRESENT:\x1B(B\x1B[m \x1B[32m/somepath/somefile_a.csv\x1B(B\x1B[m
\x1B[1m\x1B[32mPRESENT:\x1B(B\x1B[m \x1B[32m/somepath/somefile_b.csv\x1B(B\x1B[m
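
Since bash does not expand aliases in non-interactive scripts by default, a shell function may be more convenient when the cleanup has to happen inside the script itself. The following is only a sketch under that assumption, not the code from this answer: it reuses the name decolor for a function, simplifies the two patterns (any run of digits and semicolons before the final letter, then any ESC ( letter pair), and builds the ESC byte with printf.

```shell
# decolor as a function: a simplified, hypothetical variant of the aliases.
decolor() {
    esc=$(printf '\033')
    # First expression: CSI sequences ending in m, K, H, f or J.
    # Second expression: character-set designations such as ESC ( B.
    LC_ALL=C sed -e "s|${esc}\[[0-9;]*[mKHfJ]||g" -e "s|${esc}([A-Z]||g"
}

# One line of the dummy data, with real ESC bytes.
printf '\033[1m\033[32mPRESENT:\033(B\033[m \033[32m/somepath/somefile_a.csv\033(B\033[m\n' | decolor
```

For the dummy line this should print PRESENT: /somepath/somefile_a.csv, matching the output shown earlier.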

Upvotes: 0

Wiimm

Reputation: 3517

My solution:

... | sed $'s/\e\\[[0-9;:]*[a-zA-Z]//g'

The colon is there to support escapes for some old terminal types.
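
To see what the colon changes, here is a small comparison sketch. The helper names strip_without_colon and strip_with_colon are made up for illustration, the sample sequence ESC[38:5:196m is just an assumed example of a colon-separated parameter list, and the ESC byte is built with printf so the snippet does not depend on $'...' quoting.

```shell
esc=$(printf '\033')

# Same pattern as above, but without ':' in the parameter bracket.
strip_without_colon() {
    LC_ALL=C sed "s/${esc}\[[0-9;]*[a-zA-Z]//g"
}

# The pattern from the answer, with ':' accepted between parameters.
strip_with_colon() {
    LC_ALL=C sed "s/${esc}\[[0-9;:]*[a-zA-Z]//g"
}

# The colon-separated sequence survives the first helper (cat -v shows
# the leftover ESC as ^[), but is removed by the second one.
printf '\033[38:5:196mred\033[0m\n' | strip_without_colon | cat -v
printf '\033[38:5:196mred\033[0m\n' | strip_with_colon
```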

Upvotes: 11

sandy_1111

Reputation: 383

sed -r "s/\x1B\[(([0-9]{1,2})?(;)?([0-9]{1,2})?)?[mKHfJ]//g" file_name

This command removes the special characters and color codes from the file.

These are some of the ANSI codes:

  • ESC[#;#H or ESC[#;#f moves the cursor to line #, column #
  • ESC[2J clears the screen and homes the cursor
  • ESC[K clears to the end of the line

Note that in the case of the clear code ESC[K there is neither a number nor a semicolon.

As the comment below points out, if the numbers have more than two digits, use this instead:

sed -r "s/\x1B\[(([0-9]+)(;[0-9]+)*)?[mKHfJ]//g" filename
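
The difference between the two patterns shows up with longer parameter lists. The sketch below compares them on an assumed 256-colour sequence, ESC[38;5;196m. The helper names strip_short and strip_long are hypothetical, -E is used in place of -r, the final-byte bracket is written without commas, and the ESC byte comes from printf.

```shell
esc=$(printf '\033')

# First pattern: at most two parameters of one or two digits each.
strip_short() {
    LC_ALL=C sed -E "s/${esc}\[(([0-9]{1,2})?(;)?([0-9]{1,2})?)?[mKHfJ]//g"
}

# Second pattern: any number of parameters of any length.
strip_long() {
    LC_ALL=C sed -E "s/${esc}\[(([0-9]+)(;[0-9]+)*)?[mKHfJ]//g"
}

# The three-parameter colour sequence survives strip_short (cat -v shows
# the leftover ESC as ^[), but is removed by strip_long.
printf '\033[38;5;196mred\033[0m\n' | strip_short | cat -v
printf '\033[38;5;196mred\033[0m\n' | strip_long

# Cursor and clear codes from the list above are handled by strip_long as well.
printf '\033[2J\033[10;20Hhi\033[K\n' | strip_long
```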

Upvotes: 23

MacUsers

Reputation: 2229

Does this solve the issue?

$ echo "^[[38;1;32mHello^[[39m" | sed -e 's/\^\[\[[0-9;]\{2,\}m//g'
Hello

cheers!!
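
One thing worth checking before relying on this pattern is whether the file contains the two literal characters ^ and [ or a real ESC byte (0x1B), which many tools merely display as ^[. The sketch below wraps the same sed expression in a made-up helper, strip_caret, and runs it on both kinds of input. It assumes cat -v is available to make a leftover ESC byte visible.

```shell
# strip_caret: the pattern from above in a hypothetical helper.
strip_caret() {
    sed -e 's/\^\[\[[0-9;]\{2,\}m//g'
}

# Literal caret notation, as pasted in the question: both codes are removed.
printf '%s\n' '^[[38;1;32mHello^[[39m' | strip_caret

# Real ESC bytes, as a colouring program would write them: nothing matches,
# and cat -v shows the sequences are still there.
printf '\033[38;1;32mHello\033[39m\n' | strip_caret | cat -v
```

If the second case applies, one of the patterns that match \x1B (or \e) is needed instead.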

Upvotes: 1
