The use of molecular simulation to estimate the strength of macromolecular binding free energies is becoming increasingly widespread, with goals ranging from lead optimization and enrichment in drug discovery to personalizing or stratifying treatment regimes. In order to realize the potential of such approaches to predict new results, not merely to explain previous experimental findings, it is necessary that the methods used are reliable and accurate, and that their limitations are thoroughly understood. However, the computational cost of atomistic simulation techniques such as molecular dynamics (MD) has meant that until recently little work has focused on validating and verifying the available free energy methodologies, with the consequence that many of the results published in the literature are not reproducible. Here, we present a detailed analysis of two of the most popular approximate methods for calculating binding free energies from molecular simulations, molecular mechanics Poisson-Boltzmann surface area (MMPBSA) and molecular mechanics generalized Born surface area (MMGBSA), applied to the nine FDA-approved HIV-1 protease inhibitors. Our results show that the values obtained from replica simulations of the same protease-drug complex, differing only in initially assigned atom velocities, can vary by as much as 10 kcal/mol, which is greater than the difference between the best and worst binding inhibitors under investigation. Despite this, analysis of ensembles of simulations producing 50 trajectories of 4 ns duration leads to well converged free energy estimates. For seven inhibitors, we find that with correctly converged normal mode estimates of the configurational entropy, we can correctly distinguish inhibitors in agreement with experimental data for both the MMPBSA and MMGBSA methods and thus have the ability to rank the efficacy of binding of this selection of drugs to the protease (no account is made for free energy penalties associated with protein distortion leading to the over estimation of the binding strength of the two largest inhibitors ritonavir and atazanavir). We obtain improved rankings and estimates of the relative binding strengths of the drugs by using a novel combination of MMPBSA/MMGBSA with normal mode entropy estimates and the free energy of association calculated directly from simulation trajectories. Our work provides a thorough assessment of what is required to produce converged and hence reliable free energies for protein-ligand binding. The use of molecular simulation to estimate the strength of macromolecular binding free energies is becoming increasingly widespread, with goals ranging from lead optimization and enrichment in drug discovery to personalizing or stratifying treatment regimes. In order to realize the potential of such approaches to predict new results, not merely to explain previous experimental findings, it is necessary that the methods used are reliable and accurate, and that their limitations are thoroughly understood. However, the computational cost of atomistic simulation techniques such as molecular dynamics (MD) has meant that until recently little work has focused on validating and verifying the available free energy methodologies, with the consequence that many of the results published in the literature are not reproducible. Here, we present a detailed analysis of two of the most popular approximate methods for calculating binding free energies from molecular simulations, molecular mechanics Poisson-Boltzmann surface area (MMPBSA) and molecular mechanics generalized Born surface area (MMGBSA), applied to the nine FDA-approved HIV-1 protease inhibitors. Our results show that the values obtained from replica simulations of the same protease-drug complex, differing only in initially assigned atom velocities, can vary by as much as 10 kcal/mol, which is greater than the difference between the best and worst binding inhibitors under investigation. Despite this, analysis of ensembles of simulations producing 50 trajectories of 4 ns duration leads to well converged free energy estimates. For seven inhibitors, we find that with correctly converged normal mode estimates of the configurational entropy, we can correctly distinguish inhibitors in agreement with experimental data for both the MMPBSA and MMGBSA methods and thus have the ability to rank the efficacy of binding of this selection of drugs to the protease (no account is made for free energy penalties associated with protein distortion leading to the over estimation of the binding strength of the two largest inhibitors ritonavir and atazanavir). We obtain improved rankings and estimates of the relative binding strengths of the drugs by using a novel combination of MMPBSA/MMGBSA with normal mode entropy estimates and the free energy of association calculated directly from simulation trajectories. Our work provides a thorough assessment of what is required to produce converged and hence reliable free energies for protein-ligand binding.