sql - How to count words frequency in Teradata -


for example if have 1000 rows of data has customer id (e.g. 123) , comments on our product (e.g. great product easy use)

how use teradata (version 15) word frequency count output has 2 columns 1 word , other frequency e.g. (great: 20, product: 10)?

thank you

you use strtok_split_to_table pull off.

something following:

select d.token, sum(d.outkey) table (strtok_split_to_table(1, <yourtable>.<yourcommentsfield>, ' ')         returns (outkey integer, tokennum integer, token varchar(20)character set unicode) ) d  group 1 

this split each word in comments field individual records, counts occurrence of each word. stick own <yourtable>.<yourcommentsfield> in there , should go.

more information on strtok_split_to_table: http://www.info.teradata.com/htmlpubs/db_ttu_14_00/index.html#page/sql_reference/b035_1145_111a/string_ops_funcs.084.242.html

here sql , results test on system:

create set table db.testcloud ,no fallback ,      no before journal,      no after journal,      checksum = default,      default mergeblockratio      (       customer varchar(10) character set latin not casespecific,       comments varchar(1000) character set latin not casespecific) primary index ( customer );   insert testcloud (1, 'this test comment'); insert testcloud (2, 'this comment of something');  select d.token, sum(d.outkey) table (td_sysfnlib.strtok_split_to_table(1, testcloud.comments, ' -/')         returns (outkey integer, tokennum integer, token varchar(20)character set unicode) ) d  group 1  --token sum(outkey) --is    2 --also  1 --this  2 --of    1 --test  1 --a 1 --comment   2 --something 1 

Comments

Popular posts from this blog

Payment information shows nothing in one page checkout page magento -

tcpdump - How to check if server received packet (acknowledged) -