Due to being embedded inside the CPU The TCM has a Harvard-architecture, so there is an ITCM (instruction TCM) and a DTCM (data TCM). The DTCM can not contain any instructions, but the ITCM can actually contain data. The size of DTCM or ITCM is minimum 4KiB so the typical minimum configuration is 4KiB ITCM and 4KiB DTCM.
It looks like tcm have same purpose as cache memory.
No. They didn't used the word cache in explanation
A cache uses access patterns to populate data within the cache. It has extra hardware to track the backing address and may have communication with other system entities (SMP) to track when a cache line is dirty (someone else has written something to primary memory).
The 'TCM' (tightly coupled memory) is fast, probably SRAM multi-transistor memory, like the cache. Both have a fast dedicated connection to the CPU. However, the overhead to implement the TCM is far less than a cache. Typically TCM is found on lower-end (deeply embedded probably Cortex-M) ARM devices.
Most CPU caches have a lock down feature which enables them to behave like the TCM. However, the TCM does not have on the fly capabilities to buffer high use code and data. Because of this, the TCM (and locked cache) is probably more deterministic which may help hard real time applications.