You need to route the incoming clock through the BUFIO/BUFR, and keep all the other signals (FCO and the DATA lines) in the same bank as that clock.
This lets you run the SERDES at the maximum I/O bandwidth these devices support.
At the output of the SERDES, transfer the parallel data into the global clock (GCLK) domain.
You can use the core generator to produce a VHDL/Verilog template and then customize it from there.
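To make the clock relationship concrete, here is a rough sketch of the arithmetic behind the BUFIO/BUFR arrangement: the BUFIO carries the full-rate bit clock to the SERDES, and the BUFR divides it down to the parallel word rate. The function names, the 8-bit DDR width, and the 400 MHz bit clock are illustrative assumptions only, not values from your design.

```python
def bufr_divide(serdes_width: int, ddr: bool = True) -> int:
    """Divide ratio that makes the divided clock match the SERDES word rate.

    Assumption: in DDR mode two bits arrive per bit-clock cycle, in SDR mode one.
    """
    bits_per_clock = 2 if ddr else 1
    if serdes_width <= 0 or serdes_width % bits_per_clock != 0:
        raise ValueError("SERDES width must be a positive multiple of bits per clock")
    return serdes_width // bits_per_clock


def word_clock_mhz(bit_clock_mhz: float, serdes_width: int, ddr: bool = True) -> float:
    """Frequency of the divided (parallel-side) clock in MHz."""
    return bit_clock_mhz / bufr_divide(serdes_width, ddr)


if __name__ == "__main__":
    # Hypothetical example: 1:8 deserialization, DDR, 400 MHz bit clock.
    print(bufr_divide(8, ddr=True))
    print(word_clock_mhz(400.0, 8, ddr=True))
```

Check the resulting ratio against the divide values your device's BUFR actually supports before settling on a SERDES width.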