Dear authors, first of all, thank you for the amazing work.
As the title says, I'm struggling to reproduce exactly the results you show in Fig. 7, which I include here for clarity.
I ran the code as described in the README, averaging over three seeds with the RL-trained checkpoint you provided. Could you give me some guidance?
Best,
Matteo