Data Science Asked by J.Smith on May 28, 2021
I want to extract each tree so that I can feed it with any data, and see the output.
dump_list = xg_clas.get_booster().get_dump()  # one text dump per tree
num_t = len(dump_list)
print("Number of Trees =", num_t)
I can find the number of trees with the code above,
plt.rcParams['figure.figsize'] = [50, 10]  # set the size before plotting so it applies to this figure
xgb.plot_tree(xg_clas, num_trees=0)
plt.show()
and graph each tree with the plotting code above. When I do something like:
dump_list[0]
it gives me the tree as text. But I couldn't find any way to extract a tree as an object and use it.
https://github.com/dmlc/xgboost/issues/117#ref-commit-3f6ff43
I found this but didn’t really understand what is suggested.
Progress: I tried to turn the
dump_list[0]
string into a sklearn DecisionTreeClassifier object, still with no luck.
I uploaded my notebook if you want to check it out: https://github.com/sciencelove11/Question
This is an open feature request (at time of writing):
https://github.com/dmlc/xgboost/issues/2175
https://github.com/dmlc/xgboost/issues/3439
There, a very wasteful but working solution is mentioned: predict using ntree_limit for each number of trees of interest. I've put together a quick demonstration Colab notebook here.
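The idea behind that workaround can be sketched without XGBoost at all: if you can get raw (margin) predictions from the model truncated to its first k boosting rounds, then subtracting consecutive results isolates what each round adds. The helper below is a minimal sketch under that assumption; per_round_margins, predict_first_k and the made-up numbers are all hypothetical names and data for illustration, not part of XGBoost. Recent XGBoost releases use iteration_range where older ones used ntree_limit, so the wiring shown in the comments is an assumption you should check against your installed version.

```python
from typing import Callable, List, Sequence


def per_round_margins(
    predict_first_k: Callable[[int], Sequence[float]], n_rounds: int
) -> List[List[float]]:
    """Return, for each boosting round, what it adds to every row's raw margin.

    predict_first_k(k) must return raw margin predictions from the model
    truncated to its first k rounds (k = 1..n_rounds). The entry for round 1
    still contains the model's base score, since there is no "zero rounds"
    prediction to subtract from it.
    """
    contributions: List[List[float]] = []
    previous: List[float] = []
    for k in range(1, n_rounds + 1):
        current = [float(v) for v in predict_first_k(k)]
        if k == 1:
            contributions.append(current)
        else:
            contributions.append([c - p for c, p in zip(current, previous)])
        previous = current
    return contributions


# Hypothetical wiring for a fitted binary classifier xg_clas and features X
# (assumed, not run here):
#   booster = xg_clas.get_booster()
#   dmat = xgb.DMatrix(X)
#   predict_first_k = lambda k: booster.predict(
#       dmat, iteration_range=(0, k), output_margin=True)

# Made-up per-round outputs for two rows, standing in for a real model.
FAKE_ROUNDS = [[0.5, -0.5], [0.25, 0.25], [-0.125, 0.375]]


def fake_predict_first_k(k: int) -> List[float]:
    # Cumulative margin after the first k fake rounds, one value per row.
    return [sum(r[i] for r in FAKE_ROUNDS[:k]) for i in range(2)]


recovered = per_round_margins(fake_predict_first_k, len(FAKE_ROUNDS))
print(recovered)
```

This is wasteful for the reason given above: it calls predict once per round. Note also that for multi-class models one round contains one tree per class, so a "round" is not a single tree there.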
It has also been asked several times over at SO, see e.g.
https://stackoverflow.com/questions/51681714/extract-trees-and-weights-from-trained-xgboost-model
https://stackoverflow.com/questions/37677496/how-to-get-access-of-individual-trees-of-a-xgboost-model-in-python-r
and their Related questions.
In the first link, another workaround is mentioned: by dumping to text/PMML, you should be able to reload each individual tree (or subsets thereof) and make the predictions. It's not clear how to make this work though: XGB itself doesn't have an easy way to load a model except from its own binary format. You might be able to do it by parsing the output (JSON seems most promising) into another library with tree models.
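As a rough illustration of the JSON-parsing route, here is a minimal sketch that walks one tree in plain Python, with no other tree library involved. It assumes the nested layout produced by get_dump(dump_format='json'), where split nodes carry split, split_condition, yes, no, missing and children, and leaves carry leaf. The tree below is hand-written for the example rather than taken from a real model, and eval_tree is a hypothetical helper name.

```python
import json
import math

# Hand-written example in the shape of one element of
# booster.get_dump(dump_format='json'); the values are made up.
TREE_JSON = """
{"nodeid": 0, "split": "f0", "split_condition": 0.5,
 "yes": 1, "no": 2, "missing": 1,
 "children": [
   {"nodeid": 1, "leaf": -0.4},
   {"nodeid": 2, "split": "f1", "split_condition": 2.0,
    "yes": 3, "no": 4, "missing": 4,
    "children": [{"nodeid": 3, "leaf": 0.1}, {"nodeid": 4, "leaf": 0.6}]}
 ]}
"""


def eval_tree(node: dict, row: dict) -> float:
    """Follow one tree from the root to a leaf for a single row.

    row maps feature names (e.g. "f0") to values; absent or NaN values
    take the node's "missing" branch.
    """
    while "leaf" not in node:
        value = row.get(node["split"])
        if value is None or (isinstance(value, float) and math.isnan(value)):
            next_id = node["missing"]
        elif value < node["split_condition"]:
            next_id = node["yes"]  # "yes" branch: feature < threshold
        else:
            next_id = node["no"]
        node = next(c for c in node["children"] if c["nodeid"] == next_id)
    return node["leaf"]


tree = json.loads(TREE_JSON)
print(eval_tree(tree, {"f0": 0.2, "f1": 5.0}))
print(eval_tree(tree, {"f0": 0.9, "f1": 1.0}))
```

With a real model you would call json.loads on each string in the dump and evaluate the trees you care about. The leaf values are raw scores: to reproduce the model's output you would still need to add them up together with the base score and apply the objective's link function (a sigmoid for binary logistic), and values that sit exactly on a threshold may differ because XGBoost compares in 32-bit floats.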
Correct answer by Ben Reiniger on May 28, 2021