- Refactored batch and mini_batch training to use a common gradient computation function (removed duplicate code).
- Altered the cost computation so that NaN is not computed unnecessarily.
- Greatly simplified (and sped up) the code that appends a column of 1s to the data.
- Minor code cleanup.
- Removed unused variables.
- Added a cast to float to remove a warning.
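
As a rough illustration of the first and third items, the sketch below shows one way batch and mini-batch training can share a single gradient function, and one way to add the column of 1s in a single pass. It is not the project's code: the names `add_bias_column`, `gradient`, `batch_step`, and `mini_batch_step`, the linear model, and the squared-error loss are all assumptions made for the example.

```python
import random


def add_bias_column(rows):
    """Return a copy of rows with a leading 1.0 added to every row."""
    return [[1.0] + list(row) for row in rows]


def gradient(weights, rows, targets):
    """Gradient of the halved mean squared error (assumed loss) over rows."""
    grad = [0.0] * len(weights)
    for row, target in zip(rows, targets):
        # Prediction error of the assumed linear model for this row.
        error = sum(w * x for w, x in zip(weights, row)) - target
        for j, x in enumerate(row):
            grad[j] += error * x
    return [g / len(rows) for g in grad]


def batch_step(weights, rows, targets, rate):
    """One update using every row; delegates to the shared gradient()."""
    grad = gradient(weights, rows, targets)
    return [w - rate * g for w, g in zip(weights, grad)]


def mini_batch_step(weights, rows, targets, rate, batch_size, rng=random):
    """One update using a random subset; same gradient(), fewer rows."""
    picked = rng.sample(range(len(rows)), batch_size)
    grad = gradient(
        weights,
        [rows[i] for i in picked],
        [targets[i] for i in picked],
    )
    return [w - rate * g for w, g in zip(weights, grad)]
```

With this shape, the two training modes differ only in which rows they pass to `gradient`, so a fix to the gradient applies to both.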