Database Administrators Asked by Velocity on October 28, 2021
I have a table named data with a column pid & display_data.
Pid is the serial number of all rows.
The value of display_data column is as below :
For each row display_Data has values in the format =>
For PID= 1:
The labels are constant for all rows which would become column headers. Values are changing with respect to labels.
There are total 400 such rows hence, the query needs to be dynamic.
I want to parse this column value,strip html tags and into a new view split it using sql query(Regexp_Replace maybe) such that:
All label values become columns, i.e.
The actual data is much more with many labels and values however this is just sample to get help.
Convert HTML to XML and parse with XMLTABLE()
HTML and XML are both descendants of SGML.
From the sample data you have posted, it looks like you just need to remove the <br>
tags and wrap desc_data
in a root element (eg <row>
) for the conversion into XML.
Once in XML format, you can parse the data with XMLTABLE()
to give you:
<span>
( via for ordinality
, column name RN
)@label
<span>
This is done within the CTE named xml_data
in my example.
Assuming value
always follows label
, you can convert the instance of <span>
( RN
) to have the same value for each pair. (this is also done in xml_data
)
From there, you can PIVOT
the data into a table that has:
RID
)label
(LABEL
is a key word so I used LABEL_TXT
)value
(VALUE
is a key word so I used VALUE_TXT
)This is done within the CTE labeled key_val
in my example.
For the final query, the final columns names MUST BE KNOWN at compile time. Once you know the columns, you can just do another PIVOT
for a proper table.
As you can see, HTML is not a good data format to use for parsing data and/or using the data within a computer (eg a database).
Unless desc_data
is used for display, I recommend that the data is stored in XML or JSON format. (I'm assuming you can change the source)
Once data is in a parseable format, PL/SQL is no longer needed.
with data as (
select q'[<span class ="label">Project Name</span><br> <span class ="value">Digital</span> <span class ="label">Dept Name</span><br> <span class ="value">Finance</span><span class ="label">Level</span><br> <span class ="value">Level 5</span>]'
as txt
,1 AS RID
from dual
),xml_data as (
select RID, a.rn - mod(a.rn+1,2) rn,a.label_txt, a.val
from data d, xmltable( '/row/span'
passing xmltype( '<row>' || replace(d.txt,'<br>','') || '</row>' )
columns
rn for ordinality,
label_txt varchar2(20) path '/span/@class',
val varchar2(20) path '/span'
) a
), key_val as (
select rid, label_txt, value_txt
from xml_data
pivot (
min( val )
for label_txt in ( 'label' as LABEL_TXT, 'value' as VALUE_TXT)
)
)
select RID, PNAME, LVL, DNAME
from key_val
pivot (
min( VALUE_TXT )
for LABEL_TXT in ('Project Name' as PNAME, 'Level' as LVL, 'Dept Name' as DNAME )
)
Answered by Michael Kutz on October 28, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP