Word-level transcription of the TIMIT dataset

How can I do word-level, rather than phoneme-level, speech recognition with the TIMIT dataset?
I have built and trained models, but I only have phoneme-level transcriptions. I want word-level transcriptions of the audio files. Could you help me?