Perl - Convert UTF-8 char to hyphen - read UTF-8 as single char


I am new to Perl. I have a requirement to convert the UTF-8 (non-ASCII) characters in a string to a hyphen (-).

Input string: "ivm ist   20150324095652 31610150096     10ÑatÑ25Ñdisco 0000000091"
Expected output: "ivm ist   20150324095652 31610150096     10-at-25-disco 0000000091"

But the program I have written below reads each UTF-8 character as 2 separate bytes, so I am getting the output "10--at--25--disco".

[root@ cdr]# cat ../asciifilter.pl
#!/usr/bin/perl
use strict;
use Encode;

my @chars;
my $character;
my $num;

# Input is read as raw bytes, so each 2-byte UTF-8 character yields 2 hyphens.
while (my $row = <>) {
  @chars = split(//, $row);
  foreach $character (@chars) {
    $num = ord($character);
    if ($num < 127) {
      print $character;
    } else {
      print "-";
    }
  }
}

Output:

  [root@mavbgl-351l cdr]# echo "ivm ist   20150324095652 31610150096     10ÑatÑ25Ñdisco 0000000091" | ../asciifilter.pl
  ivm ist   20150324095652 31610150096     10--at--25--disco 0000000091

But that particular 4th string column has a fixed length of only 14 characters, so the additional hyphens are creating a problem.

Can anyone give me a clue on how to read a UTF-8 char as a single character?

The main thing you need is perl -CSD. With that, the script can be as simple as

perl -CSD -pe 's/[^\x00-\x7f]/-/g'

See man perlrun for a discussion of the options; briefly, -CS means STDIN, STDOUT, and STDERR are in UTF-8; and -CD means UTF-8 is the default PerlIO layer for both input and output streams. (This script only uses STDIN and STDOUT, so the D isn't strictly necessary; but if you learn one magic incantation, learn -CSD.)
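If the command-line switches are not an option, the same idea can be sketched inside a script by decoding the input bytes into characters before substituting. The following is only a minimal sketch under that assumption: the helper name ascii_filter is hypothetical, and the sample bytes stand in for the 14-character column from the question (the sequence \xC3\x91 is the UTF-8 encoding of U+00D1, N with tilde).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not from the original post): takes raw UTF-8
# bytes and returns text in which every non-ASCII character has been
# replaced by exactly one hyphen.
sub ascii_filter {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);   # bytes -> Perl characters
    $text =~ s/[^\x00-\x7F]/-/g;          # one hyphen per character
    return $text;
}

# Sample column from the question, written as explicit UTF-8 bytes.
my $input  = "10\xC3\x91at\xC3\x9125\xC3\x91disco";
my $output = ascii_filter($input);

print "$output\n";              # prints 10-at-25-disco
print length($output), "\n";    # prints 14
```

To use this as a filter like the original asciifilter.pl, one could loop over the input rows and print ascii_filter($row) for each. Calling binmode(STDIN, ':encoding(UTF-8)') before reading should have a similar effect to the explicit decode step.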

