Perl - Convert utf-8 char to hyphen - read utf-8 as single char
I am new to Perl. I have a requirement to convert the non-ASCII UTF-8 characters in a string to a hyphen (-).
Input string: "ivm ist 20150324095652 31610150096 10ÑatÑ25Ñdisco 0000000091". Expected output: "ivm ist 20150324095652 31610150096 10-at-25-disco 0000000091".
But the program I have written below reads each UTF-8 character as 2 separate bytes, so I am getting the output "10--at--25--disco".
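[root@ cdr]# cat ../asciifilter.pl
#!/usr/bin/perl
use strict;
use Encode;

my @chars;
my $character;
my $num;

# Read each input line and print it back one character at a time,
# replacing anything outside the ASCII range with a hyphen.
while (my $row = <>) {
    @chars = split(//, $row);
    foreach $character (@chars) {
        $num = ord($character);
        if ($num < 127) {
            print $character;
        } else {
            print "-";
        }
    }
}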
Output:
[root@mavbgl-351l cdr]# echo "ivm ist 20150324095652 31610150096 10ÑatÑ25Ñdisco 0000000091" | ../asciifilter.pl
ivm ist 20150324095652 31610150096 10--at--25--disco 0000000091
But this particular 4th string column has a fixed length of only 14 characters, so the additional hyphens are creating a problem.
Can anyone give me a clue on how to read a UTF-8 char as a single character?
The main thing you need is perl -CSD. With that, the script can be as simple as

perl -CSD -pe 's/[^\x00-\x7f]/-/g'

See man perlrun for a discussion of the options; briefly, -CS means that STDIN, STDOUT, and STDERR are in UTF-8, and -CD means that UTF-8 is the default PerlIO layer for both input and output streams. (This script only uses STDIN and STDOUT, so the D isn't strictly necessary; but if you learn one magic incantation, learn -CSD.)