How to urlencode data for curl command?

I am trying to write a bash script for testing that takes a parameter and sends it via curl to a web site. I need to URL-encode the value to make sure that special characters are processed properly. What is the best way to do this?

Here is my basic script so far:

#!/bin/bash
host=${1:?'bad host'}
value=$2
shift
shift
curl -v -d "param=${value}" http://${host}/somepath $@

Use curl --data-urlencode; from man curl:

This posts data, similar to the other --data options with the exception that this performs URL-encoding. To be CGI-compliant, the <data> part should begin with a name followed by a separator and a content specification.

Example usage:

curl \
    --data-urlencode "paramName=value" \
    --data-urlencode "secondParam=value" \
    http://example.com

See the man page for more info.

This requires curl 7.18.0 or newer (released January 2008). Use curl -V to check which version you have.

You can also use it to encode the query string:

curl -G \
    --data-urlencode "p1=value 1" \
    --data-urlencode "p2=value 2" \
    http://example.com
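With -G, curl appends the encoded pairs to the URL as a query string instead of sending them in the request body, so the request above should go out roughly like this (a sketch of the request line, not verbatim curl output):

GET /?p1=value%201&p2=value%202 HTTP/1.1
Host: example.com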

Here is the pure BASH answer.

rawurlencode() {
  local string="${1}"
  local strlen=${#string}
  local encoded=""
  local pos c o

  for (( pos=0 ; pos<strlen ; pos++ )); do
     c=${string:$pos:1}
     case "$c" in
        [-_.~a-zA-Z0-9] ) o="${c}" ;;
        * )               printf -v o '%%%02x' "'$c"
     esac
     encoded+="${o}"
  done
  echo "${encoded}"    # echo the result (EASIER to capture)
  REPLY="${encoded}"   # ...or use the return variable (FASTER, avoids a subshell)
}

You can use it in two ways:

easier:  echo http://url/q?=$( rawurlencode "$args" )
faster:  rawurlencode "$args"; echo http://url/q?${REPLY}
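For example, a quick sanity check of the function (lowercase hex, per the printf format above):

$ rawurlencode 'a&b c/d'
a%26b%20c%2fd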

[edited]

Here's the matching rawurldecode() function, which - with all modesty - is awesome.

rawurldecode() {
  # Replace each %NN escape with \xNN and let printf '%b' decode the hex for us
  printf -v REPLY '%b' "${1//%/\\x}"
  echo "${REPLY}"
}

With the matching set, we can now perform some simple tests:

$ diff rawurlencode.inc.sh \
    <( rawurldecode "$( rawurlencode "$( cat rawurlencode.inc.sh )" )" ) \
    && echo Matched

Output: Matched

And if you really really feel that you need an external tool (well, it will go a lot faster, and might do binary files and such...) I found this on my OpenWRT router...

replace_value=$(echo $replace_value | sed -f /usr/lib/ddns/url_escape.sed)  

Where url_escape.sed was a file that contained these rules:

s:%:%25:g
s: :%20:g
s:<:%3C:g
s:>:%3E:g
s:#:%23:g
s:{:%7B:g
s:}:%7D:g
s:|:%7C:g
s:\\:%5C:g
s:\^:%5E:g
s:~:%7E:g
s:\[:%5B:g
s:\]:%5D:g
s:`:%60:g
s:;:%3B:g
s:/:%2F:g
s:?:%3F:g
s^:^%3A^g
s:@:%40:g
s:=:%3D:g
s:&:%26:g
s:\$:%24:g
s:\!:%21:g
s:\*:%2A:g

Use Perl's URI::Escape module and uri_escape function in the second line of your bash script:

...
value="$(perl -MURI::Escape -e 'print uri_escape($ARGV[0]);' "$2")"
...

Edit: Fix quoting problems, as suggested by Chris Johnsen in the comments. Thanks!

Another option is to use jq (as a filter):

jq -sRr @uri  

-R (--raw-input) treats input lines as strings instead of parsing them as JSON and -sR (--slurp --raw-input) reads the input into a single string. -r (--raw-output) outputs the contents of strings instead of JSON string literals.

If the input is not the output of another command, you can store it in a jq string variable:

jq -nr --arg v "my shell string" '$v|@uri'  

-n (--null-input) does not read input, and --arg name value stores value in variable name as a string. In the filter, $name (in single quotes, to avoid expansion by the shell), references the variable name.

Wrapped as a Bash function, this becomes:

function uriencode { jq -nr --arg v "$1" '$v|@uri'; }  
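For example, assuming jq is installed, the wrapper behaves like this (@uri percent-encodes everything outside the RFC 3986 unreserved set):

$ uriencode 'hello world & more'
hello%20world%20%26%20more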

Or this percent-encodes all bytes:

xxd -p|tr -d \\n|sed 's/../%&/g'  

For the sake of completeness: many solutions using sed or awk only translate a special set of characters, and are hence quite large by code size, and also don't translate other special characters that should be encoded.

A safe way to urlencode is to just encode every single byte, even those that would have been allowed.

echo -ne 'some random\nbytes' | xxd -plain | tr -d '\n' | sed 's/\(..\)/%\1/g'  

xxd takes care here that the input is handled as bytes and not as characters.

edit:

xxd comes with the vim-common package in Debian, and I was just on a system where it was not installed and I didn't want to install it. The alternative is to use hexdump from the bsdmainutils package in Debian. According to the following graph, bsdmainutils and vim-common should have an about equal likelihood of being installed:

http://qa.debian.org/popcon-png.php?packages=vim-common%2Cbsdmainutils&show_installed=1&want_legend=1&want_ticks=1

But nevertheless, here is a version which uses hexdump instead of xxd and avoids the tr call:

echo -ne 'some random\nbytes' | hexdump -v -e '/1 "%02x"' | sed 's/\(..\)/%\1/g'  
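Both pipelines encode every byte, including the ones that would not strictly need it; for the sample input above, the expected output is:

%73%6f%6d%65%20%72%61%6e%64%6f%6d%0a%62%79%74%65%73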

One of the variants; it may be ugly, but it's simple:

urlencode() {
    local data
    if [[ $# != 1 ]]; then
        echo "Usage: $0 string-to-urlencode"
        return 1
    fi
    data="$(curl -s -o /dev/null -w %{url_effective} --get --data-urlencode "$1" "")"
    if [[ $? != 3 ]]; then
        echo "Unexpected error" 1>&2
        return 2
    fi
    echo "${data##/?}"
    return 0
}
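Assuming curl behaves as described (exit code 3 for the deliberately empty URL), a call should print something like:

$ urlencode 'hello world & more'
hello%20world%20%26%20more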

For example, here is the one-liner version (as suggested by Bruno):

date | curl -Gso /dev/null -w %{url_effective} --data-urlencode @- "" | cut -c 3-

date | curl -Gso /dev/null -w %{url_effective} --data-urlencode @- "" | sed -E 's/..(.*).../\1/'

I find it more readable in python:

encoded_value=$(python -c "import urllib; print urllib.quote('''$value''')")  

The triple ' ensures that single quotes in value won't hurt. urllib is in the standard library. It works, for example, for this crazy (real-world) URL:

"http://www.rai.it/dl/audio/" "1264165523944Ho servito il re d'Inghilterra - Puntata 7  

I've found the following snippet useful to stick it into a chain of program calls, where URI::Escape might not be installed:

perl -p -e 's/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg'  
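For example (everything except ASCII letters and digits is encoded, with uppercase hex per the %02X format):

$ echo -n 'foo @+bar/' | perl -p -e 's/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg'
foo%20%40%2Bbar%2F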


If you wish to run a GET request and use pure curl, just add --get to @Jacob's solution.

Here is an example:

curl -v --get --data-urlencode "access_token=$(cat .fb_access_token)" https://graph.facebook.com/me/feed  

This may be the best one:

after=$(echo -e "$before" | od -An -tx1 | tr ' ' % | xargs printf "%s")  

Direct link to the awk version: http://www.shelldorado.com/scripts/cmds/urlencode
I used it for years and it works like a charm.

PN=`basename "$0"`          # program name
VER='1.4'

: ${AWK=awk}

Usage () {
    echo >&2 "$PN - encode URL data, $VER
usage: $PN [-l] [file ...]
    -l:  encode line endings (result will be one line of output)

The default is to encode each input line on its own."
    exit 1
}

Msg () {
    for MsgLine
    do echo "$PN: $MsgLine" >&2
    done
}

Fatal () { Msg "$@"; exit 1; }

set -- `getopt hl "$@" 2>/dev/null` || Usage
[ $# -lt 1 ] && Usage           # "getopt" detected an error

EncodeEOL=no
while [ $# -gt 0 ]
do
    case "$1" in
        -l) EncodeEOL=yes;;
        --) shift; break;;
        -h) Usage;;
        -*) Usage;;
        *)  break;;             # first file name
    esac
    shift
done

LANG=C  export LANG

$AWK '
    BEGIN {
        # We assume an awk implementation that is just plain dumb.
        # We will convert an character to its ASCII value with the
        # table ord[], and produce two-digit hexadecimal output
        # without the printf("%02X") feature.

        EOL = "%0A"     # "end of line" string (encoded)
        split ("1 2 3 4 5 6 7 8 9 A B C D E F", hextab, " ")
        hextab [0] = 0
        for ( i=1; i<=255; ++i ) ord [ sprintf ("%c", i) "" ] = i + 0
        if ("'"$EncodeEOL"'" == "yes") EncodeEOL = 1; else EncodeEOL = 0
    }
    {
        encoded = ""
        for ( i=1; i<=length ($0); ++i ) {
            c = substr ($0, i, 1)
            if ( c ~ /[a-zA-Z0-9.-]/ ) {
                encoded = encoded c     # safe character
            } else if ( c == " " ) {
                encoded = encoded "+"   # special handling
            } else {
                # unsafe character, encode it as a two-digit hex-number
                lo = ord [c] % 16
                hi = int (ord [c] / 16);
                encoded = encoded "%" hextab [hi] hextab [lo]
            }
        }
        if ( EncodeEOL ) {
            printf ("%s", encoded EOL)
        } else {
            print encoded
        }
    }
    END {
        #if ( EncodeEOL ) print ""
    }
' "$@"

Here's a Bash solution which doesn't invoke any external programs:

uriencode() {
  s="${1//'%'/%25}"
  s="${s//' '/%20}"
  s="${s//'"'/%22}"
  s="${s//'#'/%23}"
  s="${s//'$'/%24}"
  s="${s//'&'/%26}"
  s="${s//'+'/%2B}"
  s="${s//','/%2C}"
  s="${s//'/'/%2F}"
  s="${s//':'/%3A}"
  s="${s//';'/%3B}"
  s="${s//'='/%3D}"
  s="${s//'?'/%3F}"
  s="${s//'@'/%40}"
  s="${s//'['/%5B}"
  s="${s//']'/%5D}"
  printf %s "$s"
}
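A quick check (only the characters listed in the function are translated):

$ uriencode 'a b&c=d'
a%20b%26c%3Dd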
Another sed-based variant:

url=$(echo "$1" | sed -e 's/%/%25/g' -e 's/ /%20/g' -e 's/!/%21/g' \
    -e 's/"/%22/g' -e 's/#/%23/g' -e 's/\$/%24/g' -e 's/\&/%26/g' \
    -e 's/'\''/%27/g' -e 's/(/%28/g' -e 's/)/%29/g' -e 's/\*/%2a/g' \
    -e 's/+/%2b/g' -e 's/,/%2c/g' -e 's/-/%2d/g' -e 's/\./%2e/g' \
    -e 's/\//%2f/g' -e 's/:/%3a/g' -e 's/;/%3b/g' -e 's/</%3c/g' \
    -e 's/>/%3e/g' -e 's/?/%3f/g' -e 's/@/%40/g' -e 's/\[/%5b/g' \
    -e 's/\\/%5c/g' -e 's/\]/%5d/g' -e 's/\^/%5e/g' -e 's/_/%5f/g' \
    -e 's/`/%60/g' -e 's/{/%7b/g' -e 's/|/%7c/g' -e 's/}/%7d/g' \
    -e 's/~/%7e/g')

This will encode the string in $1 and store it in $url, although you don't have to put it in a variable if you don't want to. BTW, I didn't include the sed expression for tab, since I thought it would be turned into spaces.

For those of you looking for a solution that doesn't need perl, here is one that only needs hexdump and awk:

url_encode() {
  [ $# -lt 1 ] && { return; }

  encodedurl="$1";

  # make sure hexdump is available; if not, just give back the url as-is
  [ ! -x "/usr/bin/hexdump" ] && { return; }

  encodedurl=`
    echo $encodedurl | hexdump -v -e '1/1 "%02x\t"' -e '1/1 "%_c\n"' |
    LANG=C awk '
      $1 == "20"                    { printf("%s",   "+"); next } # space becomes plus
      $1 ~  /0[adAD]/               {                      next } # strip newlines
      $2 ~  /^[a-zA-Z0-9.*()\/-]$/  { printf("%s",   $2);  next } # pass through what we can
                                    { printf("%%%s", $1)        } # take hex value of everything else
    '`
}

Stitched together from a couple of places across the net and some local trial and error. It works great!

uni2ascii is very handy:

$ echo -ne '你好世界' | uni2ascii -aJ  %E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C  

If you don't want to depend on Perl, you can also use sed. It's a bit messy, as each character has to be escaped individually. Make a file with the following contents and call it urlencode.sed:

s/%/%25/g
s/ /%20/g
s/!/%21/g
s/"/%22/g
s/#/%23/g
s/\$/%24/g
s/\&/%26/g
s/'/%27/g
s/(/%28/g
s/)/%29/g
s/\*/%2a/g
s/+/%2b/g
s/,/%2c/g
s/-/%2d/g
s/\./%2e/g
s/\//%2f/g
s/:/%3a/g
s/;/%3b/g
s/</%3c/g
s/>/%3e/g
s/?/%3f/g
s/@/%40/g
s/\[/%5b/g
s/\\/%5c/g
s/\]/%5d/g
s/\^/%5e/g
s/_/%5f/g
s/`/%60/g
s/{/%7b/g
s/|/%7c/g
s/}/%7d/g
s/~/%7e/g
# encode tab as %09: GNU sed understands \t; with other seds use a literal tab here
s/\t/%09/g

To use it do the following.

STR1=$(echo "https://www.example.com/change&$ ^this to?%checkthe@-functionality" | cut -d\? -f1)
STR2=$(echo "https://www.example.com/change&$ ^this to?%checkthe@-functionality" | cut -d\? -f2)
OUT2=$(echo "$STR2" | sed -f urlencode.sed)
echo "$STR1?$OUT2"

This will split the string into a part that needs encoding and a part that is fine, encode the part that needs it, then stitch the two back together.

You can put that into a sh script for convenience, maybe have it take a parameter to encode, put it on your path and then you can just call:

urlencode https://www.exxample.com?isThisFun=HellNo  


You can emulate javascript's encodeURIComponent in perl. Here's the command:

perl -pe 's/([^a-zA-Z0-9_.!~*()'\''-])/sprintf("%%%02X", ord($1))/ge'  

You could set this as a bash alias in .bash_profile:

alias encodeURIComponent='perl -pe '\''s/([^a-zA-Z0-9_.!~*()'\''\'\'''\''-])/sprintf("%%%02X",ord($1))/ge'\'  

Now you can pipe into encodeURIComponent:

$ echo -n 'hèllo wôrld!' | encodeURIComponent  h%C3%A8llo%20w%C3%B4rld!  

Here's the node version:

uriencode() {
  node -p "encodeURIComponent('${1//\'/\\\'}')"
}
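For example:

$ uriencode 'hèllo wôrld!'
h%C3%A8llo%20w%C3%B4rld!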

The question is about doing this in bash, and there's no need for python or perl, as there is in fact a single command that does exactly what you want: "urlencode".

value=$(urlencode "${2}")  

This is also much better than, for example, the perl answer above, which doesn't encode all characters correctly. Try it with the long dash you get from Word and you get the wrong encoding.

Note, you need "gridsite-clients" installed to provide this command.

Simple PHP option:

echo 'part-that-needs-encoding' | php -R 'echo urlencode($argn);'  
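Note that PHP's urlencode() produces application/x-www-form-urlencoded output (spaces become '+'); use rawurlencode() if you want %20-style percent-encoding. For example:

$ echo 'a b&c' | php -R 'echo urlencode($argn);'
a+b%26c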

Ruby, for completeness

value="$(ruby -r cgi -e 'puts CGI.escape(ARGV[0])' "$2")"  

Another php approach:

echo "encode me" | php -r "echo urlencode(file_get_contents('php://stdin'));"  

Here is a POSIX function to do that:

encodeURIComponent() {
  awk 'BEGIN {while (y++ < 125) z[sprintf("%c", y)] = y
    while (y = substr(ARGV[1], ++j, 1))
    q = y ~ /[[:alnum:]_.!~*\47()-]/ ? q y : q sprintf("%%%02X", z[y])
    print q}' "$1"
}

Example:

value=$(encodeURIComponent "$2")  


Here is my version for the busybox ash shell on an embedded system; I originally adopted Orwellophile's variant:

urlencode()
{
    local S="${1}"
    local encoded=""
    local ch
    local o
    for i in $(seq 0 $((${#S} - 1)) )
    do
        ch=${S:$i:1}
        case "${ch}" in
            [-_.~a-zA-Z0-9])
                o="${ch}"
                ;;
            *)
                o=$(printf '%%%02x' "'$ch")
                ;;
        esac
        encoded="${encoded}${o}"
    done
    echo ${encoded}
}

urldecode()
{
    local url_encoded="${1//+/ }"
    printf '%b' "${url_encoded//%/\\x}"
}

Here's a one-line conversion using Lua, similar to blueyed's answer except with all the RFC 3986 Unreserved Characters left unencoded (like this answer):

url=$(echo 'print((arg[1]:gsub("([^%w%-%.%_%~])",function(c)return("%%%02X"):format(c:byte())end)))' | lua - "$1")  

Additionally, you may need to ensure that newlines in your string are converted from LF to CRLF, in which case you can insert a gsub("\r?\n", "\r\n") in the chain before the percent-encoding.

Here's a variant that, in the non-standard style of application/x-www-form-urlencoded, does that newline normalization, as well as encoding spaces as '+' instead of '%20' (which could probably be added to the Perl snippet using a similar technique).

url=$(echo 'print((arg[1]:gsub("\r?\n", "\r\n"):gsub("([^%w%-%.%_%~ ])",function(c)return("%%%02X"):format(c:byte())end):gsub(" ","+")))' | lua - "$1")

Having php installed, I use this way:

URL_ENCODED_DATA=`php -r "echo urlencode('$DATA');"`  
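Interpolating $DATA into the PHP source breaks if the value contains a single quote; a sketch of a safer variant passes the value as a script argument instead:

URL_ENCODED_DATA=$(php -r 'echo urlencode($argv[1]);' -- "$DATA")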

This is the ksh version of Orwellophile's answer, containing the rawurlencode and rawurldecode functions (linked above in this thread). I don't have enough rep to post a comment, hence the new post.

#!/bin/ksh93

function rawurlencode
{
    typeset string="${1}"
    typeset strlen=${#string}
    typeset encoded=""

    for (( pos=0 ; pos<strlen ; pos++ )); do
        c=${string:$pos:1}
        case "$c" in
            [-_.~a-zA-Z0-9] ) o="${c}" ;;
            * )               o=$(printf '%%%02x' "'$c")
        esac
        encoded+="${o}"
    done
    print "${encoded}"
}

function rawurldecode
{
    printf $(printf '%b' "${1//%/\\x}")
}

print $(rawurlencode "C++")      # prints C%2b%2b
print $(rawurldecode "C%2b%2b")  # prints C++

What would parse URLs better than javascript?

node -p "encodeURIComponent('$url')"  

There is an excellent answer from Orwellophile, which includes a pure bash option (function rawurlencode), which I've used on my website (a shell-based CGI script handling a large number of URLs in response to search requests). The only drawback was high CPU during peak time.

I've found a modified solution leveraging bash's "global replace" feature. With this solution, processing time for url encoding is about 5X faster (see the benchmark below). The solution identifies the characters to be escaped and uses the "global replace" operator (${var//source/replacement}) to process all substitutions. The speedup clearly comes from using bash's internal loops rather than an explicit loop over each character.

Performance: on a Core i3-8100 @ 3.60GHz. Test case: 1000 URLs from Stack Overflow, similar to this ticket: "https://stackoverflow.com/questions/296536/how-to-urlencode-data-for-curl-command".

  • Existing Solution: 0.807 sec
  • Optimized Solution: 0.162 sec (5X speedup)
url_encode()
{
    local key="${1}" varname="${2:-_rval}" prefix="${3:-_ENCKEY_}"
    local unsafe=${key//[-_.~a-zA-Z0-9 ]/}
    local -i key_len=${#unsafe}
    local ch ch1 ch0

    while [ "$unsafe" ] ;do
        ch=${unsafe:0:1}
        ch0="\\$ch"
        printf -v ch1 '%%%02x' "'$ch'"
        key=${key//$ch0/"$ch1"}
        unsafe=${unsafe//"$ch0"}
    done
    key=${key// /+}

    REPLY="$key"
    return 0
}

As a minor extra, it uses '+' to encode the space, which makes for a slightly more compact URL.

Benchmark:

function t {
    local key
    for (( i=1 ; i<=$1 ; i++ )) do url_encode "$2" kkk2 ; done
    echo "K=$REPLY"
}

t 1000 "https://stackoverflow.com/questions/296536/how-to-urlencode-data-for-curl-command"


Source: https://stackoverflow.com/questions/296536/how-to-urlencode-data-for-curl-command