Distributing a single clock across a large, high-performance chip can be very challenging. This has led to the evolution of globally asynchronous, locally synchronous (GALS) systems in modern deep sub-micron (DSM) technology. GALS systems mostly use bundled-data protocols, which are based on a handshake mechanism, for data transfer. However, these protocols rely on timing assumptions between the handshake signals and the data values, which cause timing-closure problems and impose strict constraints on system-on-chip (SoC) design. This work leverages quasi-delay-insensitive (QDI) designs to propose GALS design templates, which facilitate the use of GALS systems in a conventional digital design flow with minimal intervention in the interfacing modules. Modifications to two different QDI asynchronous designs are suggested, implemented, and verified using the proposed templates. Power, energy, and latency are compared for the two different interfaces.