We assessed the performance of large language models in summarizing clinical dialogues using computational metrics and human evaluations. The comparison was made between automatically generated and ...