COVID-19 disease spread modeling by QSIR method: The parameter optimal control approach

Background: India is currently in the declining phase of the second wave of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Nevertheless, India ranks second in the world in confirmed cases (33,678,786), after the United States of America, and third in COVID-19 deaths (465,082), after the United States and Brazil. Most of these numbers arose during the second wave. Long-term projections are therefore required to mitigate the impact.
Methods: The conventional SIR model was modified by adding a new compartment, Q (quarantine), to analyze the impact of COVID-19. The parameter optimal control technique was used to fit the curve by estimating parameters such as the infection and susceptible rates.
Results: The model predicts a cumulative number of cases of 2.6928E7, with a 95% confidence interval of [2.6921E7, 2.6935E7] and an accuracy of 99.3%, on May 25, 2021 (the 480th day from 30-01-2020). The estimated R0 is 1.1475. The model's mean absolute error (EMAE) is 1.79E4, and its root-mean-square error (ERMSE) is 3.19E4. The projected cumulative cases are 3.48E7 (lockdown), 3.80E7 (periodic lockdown), and 4.52E7 (without lockdown). The overall model accuracy is 99%, the projection accuracy is about 94% up to 01-Nov-2021, and the goodness-of-fit value is 0.8954.
Conclusion: The model initially overestimates the number of cases and then shows an underestimating trend. As the number of days increases, the model accuracy decreases; thus, more control points of the cost function are required to obtain the best fit.


Introduction
The novel coronavirus first emerged in India on January 27, 2020, in Kerala. 8 COVID-19 spread gradually, reaching 171 cases by March 18, 2020, and a country-wide lockdown was implemented from March 24, 2020. The lockdown was lifted in stages starting from June 8, 2020, beginning with Unlock 1.0 and progressing up to Unlock 6.0. The number of new cases per day remained low up to June. Later, the virus showed exponential growth in India and reached approximately 97,000 new cases per day in mid-September. Mitigation strategies such as restricting transport, stopping mass gatherings, and government awareness campaigns on Covid-19 spread, together with new-normal practices such as face masks, 20 social distancing, and sanitizer use, then reduced the number of new cases and brought the spread under control. India still kept face masks mandatory. However, we cannot conclude that the decline in spread was due to mask wearing alone; according to the WHO, wearing a face mask alone is not sufficient. 20 Although India followed all kinds of restrictions, after the lockdown was lifted the country saw, from February 2021, a worse spread of the virus than in the first wave. The second wave started in Maharashtra, India, in March and progressed up to May 2021 due to several socio-economic and administrative reasons. The second wave had a massive impact on human health and the economy of the country. In the second wave, daily new cases reached about 4 lakh (400,000). To mitigate the effects of the second wave, the government took many preventive measures such as quarantine, 3,4 social distancing, 7 and a vaccination drive, along with treatments such as plasma therapy. 1 As part of controlling the spread of the virus, vaccines such as Covishield and Covaxin were introduced. With great effort, the vaccination drive reached a total of 100 crore (1 billion) doses.
Along with mitigation strategies, Industrial Revolution 5.0 23 has had a huge impact on the health care industry by introducing new technologies such as Covid-19 tracking, digital monitoring, supply chain optimization, and automated treatment. Technologies like IoT, 21,22 which transfer data without any human interface, enable better and faster treatment of Covid-19 patients by bridging medical health-monitoring devices and the network through artificial intelligence, machine learning, big data, and cloud storage, and they reduce re-admission to hospital post-Covid.
All these efforts succeeded in reducing the spread of coronavirus in the second wave. Thus, we are on the verge of the end of the second wave and probably also the end of the pandemic. Still, we cannot know how a pandemic may turn endemic, since the virus mutates rapidly over time, resulting in various strains whose dynamics are totally different from those of the early stage. 14,16 Significant strains such as the U.S. strain L452R and the indigenous strain E484Q are dominant in Indian states like Maharashtra and Karnataka. Mutated strains like E484Q and L452R have high transmission rates. So there is a need to construct new models to tackle further waves caused by unexpected mutations of the virus. Many attempts have been made to predict active cases with epidemiological models, including simple SIR and logistic growth models 9 and more complex models like MSIR and SEIR. 10 The accuracy of these models may differ based on their assumptions and parameters. However, all these attempts were made in the early stage of the spread of COVID-19. In this paper, the main focus is on the second wave, and an attempt is made to predict the probable time at which the pandemic will end. Previous studies 5,6 using SIS, SEIR, exponential growth, and logistic growth models did not take into account quarantine, lockdown, and social distancing strategies, 15,17 re-infections, or the effects of vaccination in controlling the pandemic. Here, by considering all of these factors, the nature of the second wave is modeled and the future number of cases is projected. The approximate end of the pandemic is also predicted.

Methodology
In this analysis, a new compartment, Q (Quarantine), is added to the SIR model. Here, quarantine covers only people who are not infected with the coronavirus and who are following self-isolation because of lockdown guidelines. Persons who are already Covid-19 positive are not included in compartment Q. People move from S to Q when there is a lockdown or self-isolation, and those people will not be affected by COVID-19 for a brief period. At the same time, some people move from the quarantine compartment back to the susceptible compartment because of the end of lockdown or violation of lockdown. Thus, two rates are defined for the exchange of people between Q and S. Fig. 1 shows the flow between the two compartments (see Fig. 2).
Here α is the susceptible rate (Q to S), and δ is the quarantine rate (S to Q).
2. A fraction of people always moves from Q to S with rate α, both with and without lockdown. During lockdown, a fraction of people also moves from S to Q with rate δ.
3. The first lockdown is taken to last 100 days, from the 170th to the 270th day, i.e., up to Unlock 6.0, till November 30.
4. The vaccination rate is taken from real-time data.
5. The second lockdown is taken from the 460th day to the 480th day.
6. The future projection is made for three cases: a total lockdown; a periodic lockdown of 15 days in every month; and no further lockdown at all, in which case it is assumed that only a curfew is implemented (under curfew, the rate $\delta_C$ is smaller than δ).
7. Re-infections are considered with a period of 120 days.
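The lockdown assumptions above can be sketched as a small schedule function. This is a minimal illustration, not the authors' code: the function names, the scenario labels, the phase of the periodic lockdown (first 15 days of each 30-day block after day 480), and the zero S-to-Q rate outside lockdown are all assumptions made for the example.

```python
def in_lockdown(day, scenario="none"):
    """Return True if `day` (counted from 30-Jan-2020) falls in a lockdown."""
    # Historical lockdown windows taken from the assumptions above.
    if 170 <= day <= 270 or 460 <= day <= 480:
        return True
    if day <= 480:
        return False
    # Projection scenarios after day 480 (labels are hypothetical).
    if scenario == "total":
        return True
    if scenario == "periodic":
        # Assumed phase: first 15 days of each 30-day block after day 480.
        return (day - 481) % 30 < 15
    return False  # "none": no further lockdown, curfew only


def quarantine_rate(day, delta, delta_c, scenario="none"):
    """S -> Q rate: delta in lockdown, delta_c under curfew, otherwise 0 (assumed)."""
    if in_lockdown(day, scenario):
        return delta
    if day > 480 and scenario == "none":
        return delta_c  # curfew rate, smaller than delta
    return 0.0


# Illustrative rates only, not the values from Table 1.
print(quarantine_rate(200, delta=0.1, delta_c=0.02))
print(quarantine_rate(500, delta=0.1, delta_c=0.02, scenario="none"))
```

A time-dependent rate of this kind could then be passed to whatever integrator is used for the compartment equations.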

Mathematical equations
Initially, all quantities are divided by the population size P0.
The rates δ and γ are chosen by trial and error, and β and α are estimated by the parameter optimal control approach.
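To make the compartment flows concrete, the following is a minimal sketch of one plausible QSIR system in population fractions. The equations are an assumption consistent with the description above (S to I at rate βSI, S to Q at rate δ, Q to S at rate α, I to R at rate γ), not necessarily the authors' exact system; vaccination and re-infection are omitted, and the function name `simulate_qsir` and all numerical values are hypothetical.

```python
def simulate_qsir(beta, gamma, alpha, delta, i0=1e-6, days=200, dt=0.1):
    """Forward-Euler integration of an assumed QSIR system (population fractions)."""
    s, q, i, r = 1.0 - i0, 0.0, i0, 0.0
    steps_per_day = int(round(1.0 / dt))
    history = [(s, q, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i   # S -> I
            to_quarantine = delta * s       # S -> Q
            to_susceptible = alpha * q      # Q -> S
            recoveries = gamma * i          # I -> R
            s += dt * (-new_infections - to_quarantine + to_susceptible)
            q += dt * (to_quarantine - to_susceptible)
            i += dt * (new_infections - recoveries)
            r += dt * recoveries
        history.append((s, q, i, r))       # one record per day
    return history


# Illustrative values only, not the fitted values from Table 1.
trajectory = simulate_qsir(beta=0.23, gamma=0.2, alpha=0.05, delta=0.1)
s, q, i, r = trajectory[-1]
print(f"day 200: S={s:.4f} Q={q:.4f} I={i:.6f} R={r:.6f}")
```

Because every outflow from one compartment is an inflow to another, the four fractions always sum to 1, which is a quick sanity check on any implementation.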

Parameter optimal control
Here the quantity to optimize is the cumulative number of cases, and the control vectors are β and α. The cost function F is defined in terms of θ and $\theta_0$, where θ is the cumulative number of corona cases up to a certain period from the start of the simulation and $\theta_0$ is the observed cumulative number of cases taken from real-time data.

Steps to parameter optimal control
1. In the first step, choose β and α as an initial guess.
6. Then the required β and α are $\beta = \beta - C_1 \times (dF)_\beta$ and $\alpha = \alpha - C_2 \times (dF)_\alpha$, where $C_1$ and $C_2$ are constants.
7. Repeat the procedure until the best fit of the curve is achieved.
For the detailed method of optimal control, please refer to reference 11.
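The update rule in step 6 has the form of gradient descent on the cost function. The sketch below illustrates that rule only; it is not the procedure of reference 11. The squared-error cost, the finite-difference gradients, the function names, and `toy_model` (a stand-in for running the simulation and returning θ) are all assumptions made for the example.

```python
def cost(model, beta, alpha, theta_obs):
    """Assumed squared-error cost F between modelled and observed cumulative cases."""
    return (model(beta, alpha) - theta_obs) ** 2


def fit(model, theta_obs, beta, alpha, c1, c2, h=1e-6, tol=1e-8, max_iter=10_000):
    """Gradient descent on F using central finite-difference gradients."""
    for _ in range(max_iter):
        if cost(model, beta, alpha, theta_obs) < tol:
            break
        df_beta = (cost(model, beta + h, alpha, theta_obs)
                   - cost(model, beta - h, alpha, theta_obs)) / (2 * h)
        df_alpha = (cost(model, beta, alpha + h, theta_obs)
                    - cost(model, beta, alpha - h, theta_obs)) / (2 * h)
        beta -= c1 * df_beta     # beta  = beta  - C1 * (dF)_beta
        alpha -= c2 * df_alpha   # alpha = alpha - C2 * (dF)_alpha
    return beta, alpha


def toy_model(beta, alpha):
    # Hypothetical stand-in for "simulate the model and return theta".
    return 100.0 * beta + 20.0 * alpha


beta_fit, alpha_fit = fit(toy_model, theta_obs=30.0, beta=0.2, alpha=0.1,
                          c1=1e-5, c2=1e-4)
print(beta_fit, alpha_fit, toy_model(beta_fit, alpha_fit))
```

Because this toy cost has a single target value, many (β, α) pairs minimize it; the example only shows how the constants $C_1$ and $C_2$ scale each update.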

Reproduction number
The basic reproduction number is the same as in the SIR model, since the quarantine compartment does not directly intervene in the infected compartment. Here the basic reproduction number $R_0 = \beta/\gamma$ tells us how many people are infected by a single infected individual. This formula was obtained from the next-generation matrix method. 2,18,12 The estimated $R_0$ for this model is 1.147.
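As a sketch of how the next-generation matrix method leads to this ratio, assume the infected compartment follows $dI/dt = \beta S I - \gamma I$ in normalized units and that the disease-free state has $S^{*} \approx 1$. Both are assumptions made for illustration. With a single infected compartment, the new-infection term $\mathcal{F}$ and the removal term $\mathcal{V}$ give:

```latex
\mathcal{F} = \beta S I, \qquad \mathcal{V} = \gamma I,
\qquad
F = \left.\frac{\partial \mathcal{F}}{\partial I}\right|_{S = S^{*} = 1} = \beta,
\qquad
V = \frac{\partial \mathcal{V}}{\partial I} = \gamma,
\qquad
R_0 = \rho\!\left(F V^{-1}\right) = \frac{\beta}{\gamma}.
```

Under these assumptions, the estimated $R_0 = 1.147$ corresponds to an infection rate about 1.147 times the removal rate γ.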

Error estimation and goodness of fit
The error is estimated from two formulas. 19 Mean absolute error ($E_{MAE}$) is the average error considering only magnitude and ignoring direction. Root mean square error ($E_{RMSE}$) is the square root of the average squared error and is particularly useful when there are large errors between observed and predicted values. Here the accuracy of the whole model is denoted (E), and the accuracy on a particular day is computed from that day's observed and predicted values.
Goodness of fit is computed from the normalized mean square error, where p(i) is the model's per-day number of active cases, o(i) is the observed per-day number of active cases, and (o) is the mean of the observed values. It measures how closely the model fits the observed values, which is most useful in predicting the future.
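The error measures described above can be sketched as follows. The mean absolute error and root mean square error use their standard definitions; the goodness-of-fit form, 1 minus the ratio of squared residuals to squared deviations from the observed mean, is one common normalized mean square error convention and is assumed here rather than taken from the paper. The function names and sample data are hypothetical.

```python
import math


def mean_absolute_error(obs, pred):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def root_mean_square_error(obs, pred):
    """Square root of the average squared difference."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nmse_goodness_of_fit(obs, pred):
    """Assumed form: 1 - sum((o - p)^2) / sum((o - mean(o))^2); 1 is a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


# Made-up illustrative values, not data from the paper.
observed = [100.0, 150.0, 220.0, 300.0]
predicted = [110.0, 140.0, 230.0, 290.0]
print(mean_absolute_error(observed, predicted))
print(root_mean_square_error(observed, predicted))
print(nmse_goodness_of_fit(observed, predicted))
```

With this convention, a value close to 1 indicates that the model explains most of the variation in the observed series.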

Accuracy of model
Real-time data on Covid-19 cases from 30-Jan-2020 to 25-May-2021 were used to model the spread of Covid-19. Table 1 shows the α, β, γ, and δ values under the different conditions stated in the assumptions.
The vaccination rate is taken from real-time data, as shown in Fig. 3.
The predicted cumulative number of cases as of May 25, 2021, was 2.6928E7 with a 95% CI of [2.6921E7, 2.6935E7], while the observed number from real-time data was 2.6752E7; the model accuracy is 99.3%. As shown in Fig. 4, the model predicts that the maximum per-day active cases in the first wave was 103688 on 7-Sept-2020 (224th day from 30-Jan-2020). The observed maximum was 97950 on 14-Sept-2020 (231st day from 30-Jan-2020), giving an offset of 7 days and an accuracy of 94.36%. The predicted maximum per-day active cases in the second wave was 3.8385E5 on 18-May-2021 (473rd day from 30-Jan-2020). The observed maximum was 4.1418E5 on 7-May-2021 (463rd day from 30-Jan-2020), giving an offset of 10 days and an accuracy of 92.7%.
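The reported accuracies can be checked approximately with a relative-error formula. The definition used below, 100 × (1 − |predicted − observed| / observed), is an assumption; it gives about 99.3% for the cumulative cases and about 92.7% for the second-wave peak, and about 94.1% for the first-wave peak, which is close to but not identical to the reported 94.36%.

```python
def accuracy_percent(observed, predicted):
    # Assumed definition: 100 * (1 - |predicted - observed| / observed).
    return 100.0 * (1.0 - abs(predicted - observed) / observed)


# Values quoted in the text above.
print(round(accuracy_percent(2.6752e7, 2.6928e7), 2))  # cumulative cases, 25-May-2021
print(round(accuracy_percent(4.1418e5, 3.8385e5), 2))  # second-wave peak
print(round(accuracy_percent(97950, 103688), 2))       # first-wave peak
```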
The mean absolute error of the model, $E_{MAE}$, is 1.79E4, and the root-mean-square error, $E_{RMSE}$, is 3.19E4. The goodness-of-fit value is 0.8954.

Future projection
Fig. 6 shows the future projection for the three cases. With continued lockdown, the pandemic is predicted to end after about 600 days (Oct-2021), with 3.4890E7 cumulative cases at the end of the pandemic. If a periodic lockdown of 15 days in every month is implemented, the pandemic will end after approximately 720 days (Jan-2022), with 3.80E7 cumulative cases. If no complete nationwide lockdown is imposed, there will be a sharp increase in the number of cases and in the rate of spread of the virus; the pandemic will continue for approximately 825 days (May-2022), and cumulative cases will reach 4.5E7.
The detailed data are given in Table 2 in the appendix.

Discussion and conclusion
According to the model data, the per-day active cases are overestimated by about 6% in the first wave and underestimated by about 8% in the second wave. As the number of days increases, the accuracy decreases, with an underestimating trend. This trend suggests that a single optimal control point of the cost function is not sufficient and that more cost function points are required to estimate the cases accurately. According to the projection, the pandemic ends by May 2022 even in the worst-case scenario. The figure shows a drastic decrease in cases after Aug-2021, mainly because of an increase in the vaccination rate.
Though the model predicts the active cases with high accuracy, the future projection may not hold, because the spread of the virus depends on several factors, such as pathogenic mutations of the virus, which are highly random and unpredictable, and the effectiveness of vaccination and preventive measures. Mutations have drastically affected even vaccinated people. The only way to control the spread is self-quarantine and isolation. Since the lifting of lockdown is clearly followed by an increase in cumulative active cases, people need to take care and follow the preventive measures, and the rate of vaccination must increase. This analysis was done for the whole country, but in reality the dynamics vary from state to state in space and time, so adding compartments that account for the spatial progress of COVID-19 would greatly benefit future projections.

Limitations of model
1. The model is highly sensitive to the values of α, β, γ, and δ; even a small increment changes the model's prediction.
2. Although the vaccination rate is taken from real-time data, Covid-19 vaccination requires multiple shots. Thus, there is an inherent lag in the model.
3. Re-infection is taken with a constant duration, but re-infection dynamics vary significantly in spatio-temporal pattern. This is a major cause of the varying number of waves and active cases in different countries.
4. The initial guesses affect the number of iterations needed to optimize, so care has to be taken in choosing them.
5. The values of γ and δ are estimated by trial and error, which is a difficult task when modeling multiple countries. The values can vary by a significant change in order of magnitude.

Research support
This research received no external financial or non-financial support.

Relationship
There is no additional relationship to disclose.

Patents and intellectual property
There are no patents to disclose.

Other activities
There are no other activities to disclose.

Data statement
The data required to reproduce the above findings are available to download from https://github.com/CSSEGISandData/COVID-19 and https://ourworldindata.org/coronavirus/country/india.

Declaration of competing interest
The authors declare that they have no conflict of interest.

Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.cegh.2021.100934.