sql - How to count words frequency in Teradata -
for example if have 1000 rows of data has customer id (e.g. 123) , comments on our product (e.g. great product easy use)
how use teradata (version 15) word frequency count output has 2 columns 1 word , other frequency e.g. (great: 20, product: 10)?
thank you
you use strtok_split_to_table
pull off.
something following:
select d.token, sum(d.outkey) table (strtok_split_to_table(1, <yourtable>.<yourcommentsfield>, ' ') returns (outkey integer, tokennum integer, token varchar(20)character set unicode) ) d group 1
this split each word in comments field individual records, counts occurrence of each word. stick own <yourtable>.<yourcommentsfield>
in there , should go.
more information on strtok_split_to_table: http://www.info.teradata.com/htmlpubs/db_ttu_14_00/index.html#page/sql_reference/b035_1145_111a/string_ops_funcs.084.242.html
here sql , results test on system:
create set table db.testcloud ,no fallback , no before journal, no after journal, checksum = default, default mergeblockratio ( customer varchar(10) character set latin not casespecific, comments varchar(1000) character set latin not casespecific) primary index ( customer ); insert testcloud (1, 'this test comment'); insert testcloud (2, 'this comment of something'); select d.token, sum(d.outkey) table (td_sysfnlib.strtok_split_to_table(1, testcloud.comments, ' -/') returns (outkey integer, tokennum integer, token varchar(20)character set unicode) ) d group 1 --token sum(outkey) --is 2 --also 1 --this 2 --of 1 --test 1 --a 1 --comment 2 --something 1
Comments
Post a Comment