- Refactored batch and mini_batch training to use a common gradient computation function (removed duplicate code).
- Altered the cost computation so that NAN is not computed unnecessarily.
- Greatly simplified (and sped up) the code that appends a column of 1s to the data.
- Minor code cleanup. Removed unused variables. Added cast to float to remove warning.
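
For illustration only, the shared-gradient refactor and the simplified column-of-1s helper might look roughly like the sketch below. It is not the project's code: the function names, the use of plain Python lists, and the linear least-squares model are all assumptions made for the example.

```python
# Hypothetical sketch: names and the least-squares model are assumptions,
# not taken from this commit.

def add_bias_column(rows):
    """Return a copy of rows with a trailing 1.0 appended to each row."""
    return [list(row) + [1.0] for row in rows]


def gradient(weights, rows, targets):
    """Gradient of half the mean squared error over the given rows.

    Both trainers below call this one function, so the gradient math
    lives in a single place.
    """
    grad = [0.0] * len(weights)
    for row, target in zip(rows, targets):
        error = sum(w * x for w, x in zip(weights, row)) - target
        for j, x in enumerate(row):
            grad[j] += error * x
    return [g / len(rows) for g in grad]


def train_batch(weights, rows, targets, rate, epochs):
    """Full-batch gradient descent: one update per epoch."""
    for _ in range(epochs):
        grad = gradient(weights, rows, targets)
        weights = [w - rate * g for w, g in zip(weights, grad)]
    return weights


def train_mini_batch(weights, rows, targets, rate, epochs, batch_size):
    """Mini-batch gradient descent: one update per slice of batch_size rows."""
    for _ in range(epochs):
        for start in range(0, len(rows), batch_size):
            stop = start + batch_size
            grad = gradient(weights, rows[start:stop], targets[start:stop])
            weights = [w - rate * g for w, g in zip(weights, grad)]
    return weights
```

With this shape, full-batch training is just the mini-batch case where `batch_size` equals the number of rows, which is one way to check that the two paths really share the same gradient code.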