{"id":1353,"date":"2012-11-20T20:13:14","date_gmt":"2012-11-20T20:13:14","guid":{"rendered":"http:\/\/www.robertprice.co.uk\/robblog\/?p=1353"},"modified":"2012-11-20T20:13:14","modified_gmt":"2012-11-20T20:13:14","slug":"utf-8-aware-cron-scripts","status":"publish","type":"post","link":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/","title":{"rendered":"UTF-8 Aware Cron Scripts"},"content":{"rendered":"<p>I&#8217;ve recently been having a spot of bother with <a href=\"http:\/\/en.wikipedia.org\/wiki\/UTF-8\">UTF-8<\/a> data in a <a href=\"http:\/\/www.perl.org\/\">Perl<\/a> script on an old Linux box.<\/p>\n<p>Specifically, I have been importing data from a RESTful service that includes the name Michael Bubl\u00e9. That accented e at the end of Michael&#8217;s name has been problematic.<\/p>\n<p>When I run my code from the command line, it imports correctly into my system. However, when run as a cron job, it imports as Michael Bubl\u00c3\u00a9. The \u00e9 is a multibyte character in UTF-8, but the script was reading its two bytes as separate characters and getting into a muddle.<\/p>\n<p>At first I assumed the service I was consuming had changed the encoding, but running via the command line showed no problems. 
The problem was down to a difference between the command line and cron environments.<\/p>\n<p>Checking the locale using the <code>locale<\/code> command I got this on the command line&#8230;<\/p>\n<pre>\nLANG=en_US.UTF-8\nLC_CTYPE=\"en_US.UTF-8\"\nLC_NUMERIC=\"en_US.UTF-8\"\nLC_TIME=\"en_US.UTF-8\"\nLC_COLLATE=\"en_US.UTF-8\"\nLC_MONETARY=\"en_US.UTF-8\"\nLC_MESSAGES=\"en_US.UTF-8\"\nLC_PAPER=\"en_US.UTF-8\"\nLC_NAME=\"en_US.UTF-8\"\nLC_ADDRESS=\"en_US.UTF-8\"\nLC_TELEPHONE=\"en_US.UTF-8\"\nLC_MEASUREMENT=\"en_US.UTF-8\"\nLC_IDENTIFICATION=\"en_US.UTF-8\"\nLC_ALL=\n<\/pre>\n<p>&#8230; but when running that command as a cron job and redirecting the output to a file in \/tmp, I got the following&#8230;<\/p>\n<pre>\nLANG=\nLC_CTYPE=\"POSIX\"\nLC_NUMERIC=\"POSIX\"\nLC_TIME=\"POSIX\"\nLC_COLLATE=\"POSIX\"\nLC_MONETARY=\"POSIX\"\nLC_MESSAGES=\"POSIX\"\nLC_PAPER=\"POSIX\"\nLC_NAME=\"POSIX\"\nLC_ADDRESS=\"POSIX\"\nLC_TELEPHONE=\"POSIX\"\nLC_MEASUREMENT=\"POSIX\"\nLC_IDENTIFICATION=\"POSIX\"\nLC_ALL=\n<\/pre>\n<p>Cron jobs were being executed in an environment that wasn&#8217;t UTF-8 aware. 
The solution was to set <var>LANG<\/var> in the <var>\/etc\/environment<\/var> file like this&#8230;<\/p>\n<pre>\nLANG=en_US.UTF-8\n<\/pre>\n<p>&#8230; then restart the cron daemon using<\/p>\n<pre>\n\/etc\/rc.d\/init.d\/crond restart\n<\/pre>\n<p>Now my scripts import multibyte UTF-8 data correctly whether run on the command line or as a cron job.<\/p>\n<p>The <a href=\"http:\/\/publib.boulder.ibm.com\/infocenter\/pseries\/v5r3\/topic\/com.ibm.aix.baseadmn\/doc\/baseadmndita\/etc_env_file.htm?resultof=%22%2f%65%74%63%2f%65%6e%76%69%72%6f%6e%6d%65%6e%74%22%20\"><var>\/etc\/environment<\/var><\/a> file is used to set variables that specify the basic environment for all processes, so it should be the best place to set the <var>LANG<\/var> variable.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve recently been having a spot of bother with UTF-8 data in a Perl script on an old Linux box. Specifically, I have been importing data from a RESTful service that includes the name Michael Bubl\u00e9. That accented e at the end of Michael&#8217;s name has been problematic. 
When I run my code from the &hellip; <a href=\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;UTF-8 Aware Cron Scripts&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[2],"tags":[47,71],"class_list":["post-1353","post","type-post","status-publish","format-standard","hentry","category-dev","tag-perl","tag-utf8"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>UTF-8 Aware Cron Scripts - Robert Price<\/title>\n<meta name=\"description\" content=\"Your script is UTF-8 aware when run on the command line, but not when run as a cronjob? Robert Price has the solution.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"UTF-8 Aware Cron Scripts - Robert Price\" \/>\n<meta property=\"og:description\" content=\"Your script is UTF-8 aware when run on the command line, but not when run as a cronjob? 
Robert Price has the solution.\" \/>\n<meta property=\"og:url\" content=\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\" \/>\n<meta property=\"og:site_name\" content=\"Robert Price\" \/>\n<meta property=\"article:published_time\" content=\"2012-11-20T20:13:14+00:00\" \/>\n<meta name=\"author\" content=\"rob\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"rob\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/#article\",\"isPartOf\":{\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\"},\"author\":{\"name\":\"rob\",\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/fac6d5b076e0e14e1fb13e15b542a6c5\"},\"headline\":\"UTF-8 Aware Cron Scripts\",\"datePublished\":\"2012-11-20T20:13:14+00:00\",\"mainEntityOfPage\":{\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\"},\"wordCount\":250,\"keywords\":[\"Perl\",\"utf8\"],\"articleSection\":[\"Dev\"],\"inLanguage\":\"en-GB\"},{\"@type\":\"WebPage\",\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\",\"url\":\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\",\"name\":\"UTF-8 Aware Cron Scripts - Robert Price\",\"isPartOf\":{\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/#website\"},\"datePublished\":\"2012-11-20T20:13:14+00:00\",\"author\":{\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/fac6d5b076e0e14e1fb13e15b542a6c5\"},\"description\":\"Your script is UTF-8 aware when run on the command line, but not when run as a cronjob? 
Robert Price has the solution.\",\"breadcrumb\":{\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/www.robertprice.co.uk\/robblog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"UTF-8 Aware Cron Scripts\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/#website\",\"url\":\"http:\/\/www.robertprice.co.uk\/robblog\/\",\"name\":\"Robert Price\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/www.robertprice.co.uk\/robblog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":\"Person\",\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/fac6d5b076e0e14e1fb13e15b542a6c5\",\"name\":\"rob\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/6f0eb511179100a4e968abc70403e33686e6ab3e992e392bedd2ccac01da666c?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/6f0eb511179100a4e968abc70403e33686e6ab3e992e392bedd2ccac01da666c?s=96&d=mm&r=g\",\"caption\":\"rob\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"UTF-8 Aware Cron Scripts - Robert Price","description":"Your script is UTF-8 aware when run on the command line, but not when run as a cronjob? 
Robert Price has the solution.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/","og_locale":"en_GB","og_type":"article","og_title":"UTF-8 Aware Cron Scripts - Robert Price","og_description":"Your script is UTF-8 aware when run on the command line, but not when run as a cronjob? Robert Price has the solution.","og_url":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/","og_site_name":"Robert Price","article_published_time":"2012-11-20T20:13:14+00:00","author":"rob","twitter_card":"summary_large_image","twitter_misc":{"Written by":"rob","Estimated reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/#article","isPartOf":{"@id":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/"},"author":{"name":"rob","@id":"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/fac6d5b076e0e14e1fb13e15b542a6c5"},"headline":"UTF-8 Aware Cron Scripts","datePublished":"2012-11-20T20:13:14+00:00","mainEntityOfPage":{"@id":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/"},"wordCount":250,"keywords":["Perl","utf8"],"articleSection":["Dev"],"inLanguage":"en-GB"},{"@type":"WebPage","@id":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/","url":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/","name":"UTF-8 Aware Cron Scripts - Robert Price","isPartOf":{"@id":"http:\/\/www.robertprice.co.uk\/robblog\/#website"},"datePublished":"2012-11-20T20:13:14+00:00","author":{"@id":"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/fac6d5b076e0e14e1fb13e15b542a6c5"},"description":"Your script is UTF-8 aware when run on the command line, but not when run as a cronjob? 
Robert Price has the solution.","breadcrumb":{"@id":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/"]}]},{"@type":"BreadcrumbList","@id":"http:\/\/www.robertprice.co.uk\/robblog\/utf-8-aware-cron-scripts\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/www.robertprice.co.uk\/robblog\/"},{"@type":"ListItem","position":2,"name":"UTF-8 Aware Cron Scripts"}]},{"@type":"WebSite","@id":"http:\/\/www.robertprice.co.uk\/robblog\/#website","url":"http:\/\/www.robertprice.co.uk\/robblog\/","name":"Robert Price","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/www.robertprice.co.uk\/robblog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":"Person","@id":"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/fac6d5b076e0e14e1fb13e15b542a6c5","name":"rob","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"http:\/\/www.robertprice.co.uk\/robblog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/6f0eb511179100a4e968abc70403e33686e6ab3e992e392bedd2ccac01da666c?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/6f0eb511179100a4e968abc70403e33686e6ab3e992e392bedd2ccac01da666c?s=96&d=mm&r=g","caption":"rob"}}]}},"_links":{"self":[{"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/posts\/1353","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/users\/1"}],"repli
es":[{"embeddable":true,"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/comments?post=1353"}],"version-history":[{"count":0,"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/posts\/1353\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/media?parent=1353"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/categories?post=1353"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.robertprice.co.uk\/robblog\/wp-json\/wp\/v2\/tags?post=1353"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}