In the context of Web 4.0, Federated Learning becomes an indispensable tool in the exploration and development of privacy-preserving machine learning models. This approach aligns seamlessly with the principles of symbiotic interaction within Web 4.0, facilitating the creation of machine learning interfaces that leverage decentralized data without compromising individual privacy. By incorporating Federated Learning, the project enhances its ability to process sensitive data across multiple decentralized sources—such as user interactions across various web platforms—while ensuring that personal data remains on the user's device. This method not only respects user privacy but also enriches the machine learning models with a diverse range of data inputs, leading to more personalized and efficient user experiences.
Moreover, the integration of Federated Learning within Web 4.0 frameworks underscores a significant shift towards more ethical and user-centric design philosophies. It exemplifies how advanced technologies can be harnessed to foster collaborative advancement, protect individual privacy, and enhance the collective intelligence of the system without centralizing personal data. This approach contributes to the development of a more secure, efficient, and privacy-preserving Web 4.0 ecosystem, where machine learning models are continually refined and improved through encrypted, aggregated updates shared across devices. Thus, Federated Learning stands as a cornerstone in the advancement of exploratory machine learning for symbiotic Web 4.0, embodying the project's commitment to privacy, collaboration, and innovation.
The need for a "Federated Learning Platform for Multi-Drug Resistance Research" arises from the pressing challenges posed by multi-drug resistant infections and the critical importance of maintaining patient privacy while harnessing data across multiple healthcare institutions. Federated learning, as described, enables multiple hospitals to contribute to a joint research initiative without the need to share raw patient data. This decentralized approach aligns perfectly with the sensitive nature of handling medical records and the logistical complexities of multi-institutional research.
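The idea that hospitals contribute to a joint model without sharing raw records can be illustrated with federated averaging. The following is a minimal sketch, not the platform's actual implementation: the function names, the least-squares model, and the synthetic "hospital" datasets are all assumptions made for illustration, and the encryption of updates mentioned earlier is omitted.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps of least-squares regression on one site's private data."""
    w = list(weights)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w

def federated_average(site_weights, site_sizes):
    """Average the site models, weighting each by its local sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]

def run_rounds(sites, dim, rounds=20):
    """Only model weights leave a site; the raw (x, y) records never do."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in sites]
        global_w = federated_average(updates, [len(d) for d in sites])
    return global_w

# Synthetic stand-in for three hospitals' private datasets (hypothetical data).
rng = random.Random(0)
TRUE_W = [1.5, -2.0]
sites = []
for n in (40, 60, 100):
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        y = sum(wi * xi for wi, xi in zip(TRUE_W, x)) + rng.gauss(0, 0.01)
        data.append((x, y))
    sites.append(data)

global_w = run_rounds(sites, dim=2)
print([round(w, 2) for w in global_w])
```

Each round, every site trains on its own records and returns only its weights, and the coordinator combines them in proportion to local sample counts. A real deployment would likely add secure aggregation or differential privacy on top of this loop.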
In 2025, COVID-19 has transitioned into a pattern of periodic regional flare-ups rather than a synchronized global surge. Several countries in East Asia experienced a noticeable resurgence of cases in the spring of 2025, while many Western countries saw relatively stable or declining trends during the same period. Below is a detailed breakdown of monthly positivity rates (the percentage of COVID-19 tests coming back positive) from January through June 2025 in key regions, followed by analyses of trend peaks, government responses, an overview of U.S. President Donald Trump’s stance, and a comparison of original vs. updated COVID-19 vaccines as of 2025.
South Korea began 2025 with low COVID-19 activity and very low test positivity rates. In January–February 2025, the positivity rate remained around 1–2%, reflecting a quiet period following the previous winter’s infections. An uptick occurred in March: by late March (around week 13 of 2025), South Korea’s positivity rate peaked at approximately 13%, indicating a spring wave. This wave was short-lived: through April, cases subsided rapidly. By early May, the weekly positive rate had dropped back down to only about 2–3%, a very low level. However, in late May there were early signs of another increase: the positivity rate rose again to roughly 8–9% toward the end of May. As of June 2025, South Korea’s positivity rate remains moderate (mid-single digits) and under close observation, with officials watching for a possible summer resurgence. Overall, from January to June the country saw a brief spring spike followed by a return to relatively low positivity, with a slight upward trend re-emerging at the start of summer.
Japan did not experience a dramatic COVID-19 wave in the first half of 2025, especially compared to some neighboring regions. During January–February 2025, Japan’s reported positivity rates stayed quite low (generally in the low single digits) amid a steady decline from a prior winter uptick. No significant nationwide surge was evident in early 2025. In March and April, Japan saw a mild increase in infections, a seasonal “mini-wave”, but the impact was limited: test positivity crept up slightly (remaining generally under 5–6%). By May, any small spring rise appeared to level off, and positivity rates stabilized again at relatively low levels. Entering June 2025, Japan’s COVID-19 positivity rate is low and stable. In summary, January through June in Japan was characterized by low endemic levels of COVID-19, with only minor fluctuations. The country did not face a pronounced resurgence in early 2025, and weekly positive test rates have largely remained minimal, thanks in part to widespread immunity and ongoing vigilance.
China saw a notable COVID-19 resurgence in spring 2025. After a quiet start to the year, with very low positivity in January–February 2025, viral activity began climbing in March. By late March to early April, the positivity rate (especially among patients with respiratory symptoms) had risen to around 7–8%. This growth accelerated through April and into May. During April 2025, infections spread widely; the test positivity rate more than doubled over the course of about five weeks. By early May 2025, China’s national COVID-19 positivity rate had surged to approximately 16%, reflecting a significant wave approaching levels last seen in early 2023. The trend continued upward into mid-May; by late May, positivity was reported to be nearing a peak (around the 20% mark nationally, close to the previous record high of ~21% positivity). Public health experts in China noted that this wave was peaking at the end of May. Indeed, as of June 2025, signs suggest the wave in China is cresting and starting to decline. In summary, from a low baseline in Jan–Feb, China’s positivity climbed sharply through March, April, and May, reaching a high in late May. Early June data indicate that this spring 2025 wave is at or just past its peak, with positivity expected to fall in the coming weeks.
Hong Kong experienced a pronounced uptick in COVID-19 activity in the spring of 2025. Throughout January–March 2025, Hong Kong’s positivity rates were very low and stable, with minimal cases (the region had enjoyed over half a year of low transmission following the last wave in summer 2024). However, in April 2025 infections began to rise again. The proportion of respiratory specimens testing positive increased significantly in April, jumping from roughly 6% in early April to double that level by the end of the month. By mid-May 2025, Hong Kong’s COVID-19 positivity rate had reached about 13–14%, the highest level observed in over a year. This indicates that the spring wave there was quite substantial, surpassing the previous summer’s peak in terms of positive test share. In the latter half of May, there were early signs that the wave was plateauing: the weekly positivity rate eased slightly (falling from its ~13.8% peak to around 11% by late May). Public health officials noted that while cases started to decline at the end of May, the overall positivity remained high. As of June 2025, Hong Kong’s positivity rate is still elevated (hovering around the low double-digits), but it is on a downward trend after that mid-May peak. Overall, Hong Kong went from negligible COVID levels in Jan–Mar to a steep rise in April, a peak in mid-May, and is now seeing a gradual decline moving into June.
The United States has not seen a major COVID-19 resurgence so far in 2025. In fact, the national trends from January to June 2025 show relatively low virus circulation. During January 2025, the U.S. was coming off the typical winter holiday rise; test positivity at the start of the year was moderate (on the order of a few percent) but far lower than during the pandemic’s earlier years. Through February and March 2025, positivity rates steadily decreased as the winter wave subsided. By the spring, U.S. COVID metrics were at some of their lowest levels since the pandemic began. In April 2025, nationwide test positivity hovered around 3–4%. This low plateau continued into May; in fact, the positivity rate further declined slightly in May, reaching roughly 2–3% by mid-May. By June 2025, the U.S. positivity rate remains very low (around 2% nationally in the most recent reports) with no significant uptick noted. Overall, from January to June the U.S. trend has been one of decline and stabilization: after a minor winter bump, the country saw continuous improvement, and no spring/summer wave has materialized through early June. Regional and community variations exist, but on the whole, the United States maintained low transmission and low positivity percentages throughout the first half of 2025.
Germany’s COVID-19 situation in the first half of 2025 has been largely stable with only modest fluctuations. In January 2025, Germany was emerging from the winter respiratory virus season; any COVID-19 increase over the holidays was already waning. As a result, positivity rates in January were on the decline, generally falling into low single digits by month’s end. Through February and March 2025, Germany experienced relatively low COVID circulation. There was a small bump in early spring, driven in part by the spread of new Omicron subvariants, but this surge was limited. During March, test positivity in Germany stayed roughly in the 2–5% range and did not spike dramatically. Moving into April 2025, the trend remained flat or slightly improving; hospitals and surveillance data indicated no significant strain. By May 2025, Germany’s COVID-19 positivity rate was very low (around 2–3% from sentinel surveillance). Wastewater monitoring and test data showed no rising trend in late spring. As of June 2025, Germany continues to report only sporadic cases and a low positivity percentage, with COVID activity at a minimum level. In summary, Germany saw a mild uptick in early spring that quickly stabilized, and from January to June the country maintained generally low positivity rates, reflecting a controlled situation.
France’s COVID-19 positivity rates from January to June 2025 have remained relatively contained, albeit with some peculiar data points due to changes in testing practices. In January 2025, France was coming off a late 2024 wave, and cases were decreasing; positivity rates were moderate but falling. By February and March 2025, France’s reported positivity was low and steady (in the low single digits) as widespread testing had been scaled back; by that point, mostly symptomatic individuals were being tested. In April 2025, France launched a spring booster campaign for vulnerable groups, and COVID metrics stayed fairly stable; any small increases in cases were localized and did not significantly push up national positivity. Interestingly, by May 2025, official data showed an uptick in the percentage of tests coming back positive: one report indicated roughly 15–20% positivity in late May. This figure, however, reflects the fact that testing is now focused primarily on those already suspected of illness (leading to higher proportions of positives among fewer tests). Health authorities noted that overall infection levels remained low and that the health system was not under strain. As of June 2025, France’s COVID-19 positivity rate has eased back down to lower levels as the minor spring rise subsided. In practical terms, from January through June the country did not undergo any major COVID wave; positivity rates stayed generally low, apart from statistical anomalies due to limited testing, and no significant resurgence occurred in early 2025.
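The point about targeted testing inflating positivity can be made concrete with a small calculation. This is only an illustrative sketch: the function name and all counts are hypothetical, chosen to show how the same number of detected infections yields very different positivity rates depending on how many tests are run.

```python
def positivity_rate(positives, tests):
    """Percentage of tests that come back positive."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    return 100.0 * positives / tests

# Hypothetical week: the same 300 infections are detected under two testing policies.
broad = positivity_rate(positives=300, tests=10_000)    # wide screening
targeted = positivity_rate(positives=300, tests=1_500)  # symptomatic-only testing

print(f"broad: {broad:.1f}%  targeted: {targeted:.1f}%")
# prints "broad: 3.0%  targeted: 20.0%"
```

The numerator is unchanged between the two scenarios; only the denominator shrinks. This is why positivity figures are hard to compare across countries, or across periods, with different testing policies.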
Italy has seen a calm COVID-19 situation in the first half of 2025, with no substantial resurgence and consistently low positivity rates. In January 2025, Italy was recovering from the tail of the winter season; infection numbers were decreasing and positivity was moderate (but nowhere near earlier pandemic highs). By February, the positivity rate had dropped to low levels (a few percent or less) as the country exited winter. Throughout March and April 2025, Italy’s COVID-19 metrics remained stable. There were minor increases in localized areas (possibly due to Omicron subvariant circulation), but overall test positivity stayed in the low single digits. The absence of any large new variant wave meant that hospitals saw very few severe cases relative to previous years. In May 2025, positivity rates in Italy were reported at roughly 1–3%, indicating minimal spread; this trend continued without any notable spike. By June 2025, Italy’s COVID-19 positivity remains very low. Essentially, from January to June, Italy did not experience a new wave. Each month’s data showed either steady or declining positivity, and the virus was kept largely at bay thanks to widespread immunity and ongoing preventive measures for vulnerable groups. The country’s early 2025 trend is one of quiet persistence of the virus at an endemic low level, with no sign of a surge during that period.
Current Trends: As outlined above, the most significant COVID-19 increases in 2025 have been concentrated in parts of Asia (notably China, Hong Kong, and some Southeast Asian countries), while many other regions (including North America and Europe) have seen stable or declining trends. In countries experiencing a resurgence, growth in March–May 2025 was steep. For example, China’s test positivity climbed from very low levels in late winter to about 16% by early May. Hong Kong’s positivity rate similarly doubled within about a month. These rapid increases suggest a short, intense wave pattern characteristic of Omicron subvariant surges.
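One way to put a number on how steep these increases were is a doubling time. The sketch below assumes constant exponential growth between two readings and treats positivity as a rough proxy for incidence, which is a simplification. The readings are approximate values from the regional summaries above (China from about 7–8% to 16% over roughly five weeks, Hong Kong from about 6% to double that over roughly four weeks). The 7.5% midpoint and the week counts are assumptions.

```python
import math

def doubling_time(start, end, elapsed_weeks):
    """Weeks needed to double, assuming constant exponential growth between two readings."""
    if start <= 0 or end <= start:
        raise ValueError("need 0 < start < end")
    return elapsed_weeks * math.log(2) / math.log(end / start)

# Approximate readings from the text; the 7.5% midpoint and week counts are assumptions.
china = doubling_time(start=7.5, end=16.0, elapsed_weeks=5)
hong_kong = doubling_time(start=6.0, end=12.0, elapsed_weeks=4)

print(f"China: {china:.1f} weeks, Hong Kong: {hong_kong:.1f} weeks")
```

Under these assumptions both regions show doubling times of roughly a month, consistent with the short, intense wave pattern described above.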
Estimated Timing of Peaks: Based on recent growth rates and public health commentary, the spring 2025 wave in Asia appears either to have just peaked or to be about to peak by early summer. China’s top epidemiologists indicated in late May that the country’s infection wave was reaching its apex and would likely start declining in June. This aligns with observed data: by the end of May, China’s weekly case numbers began leveling off, marking a peak after roughly two months of rising cases. Hong Kong’s officials likewise noted in mid to late May that positivity rates were beginning to edge down from a high point, suggesting that Hong Kong’s wave crested around mid-May. They expect elevated case levels to persist for a few weeks post-peak, but a gradual downward trend is forecast through June and July. South Korea, whose spring wave in late March was brief, is watching for a possible summer increase; based on last year’s pattern and neighbors’ experiences, Korean experts anticipate a potential peak in mid to late summer (July or August 2025) if a seasonal wave develops. Japan did not have a pronounced spring wave; any future peak would likely correspond with a later seasonal cycle (possibly summer or winter, depending on variant behavior).
Global Outlook and Seasonality: In regions like Europe and North America, where no significant wave materialized in early 2025, the current expectation is that there may not be a major “summer 2025” surge unless a highly transmissible new variant emerges. Surveillance data up to June 2025 show very low positivity and no upward momentum in these areas. Public health commentators in Europe have suggested that COVID-19 may settle into a more predictable seasonal pattern (potentially peaking in winter along with other respiratory viruses), although the virus has not yet shown a perfectly regular seasonality. For now, Western countries are in a valley of low transmission; any new peak might not occur until the late fall or winter of 2025. However, health authorities continue to monitor for any signals of resurgence. In summary, the “current wave” of COVID-19 in 2025 is largely an Asian phenomenon that appears to be peaking around May–June 2025, after which those countries should see relief. Elsewhere, no large wave is peaking at this time – the situation is more one of vigilance, with an eye toward the latter half of the year for any new developments.
With COVID-19 becoming an endemic illness, governments worldwide have adjusted their response strategies. Below is an overview of current policies and responses in each of the specified countries/regions regarding the 2025 resurgence potential. These include stances on lockdowns, mask mandates, vaccination efforts, and travel rules.
South Korea has moved away from heavy-handed restrictions and instead emphasizes readiness and vaccination to handle COVID-19.
Japan has adopted a “with Corona” stance, treating COVID-19 more like a seasonal illness and relying on voluntary measures rather than mandates.
China underwent a major policy shift in late 2022 (moving away from “zero-COVID”), and in 2025 its government response reflects a normalized approach with an emphasis on vaccination and targeted interventions.
Hong Kong, having endured some of the world’s strictest COVID rules in earlier phases, has relaxed nearly all mandates by 2025 and adopted a targeted, health-focused approach.
By 2025, the United States has largely transitioned back to pre-pandemic norms in terms of COVID-19 restrictions, with efforts focused on vaccination and targeted health guidance rather than broad mandates.
Germany’s pandemic response has transitioned to a maintenance mode, ensuring preparedness while avoiding intrusive measures.
France in 2025 has integrated its COVID-19 response into the general public health framework, focusing on vaccination and personal responsibility, with no extraordinary restrictions in place.
Italy, once at the epicenter of the pandemic, has relaxed into a state of managing COVID-19 similarly to other common viruses by 2025, emphasizing monitoring and caution for at-risk groups.
U.S. President Donald Trump has been vocal about his views on COVID-19 as the situation has evolved into 2025. His recent public statements and actions (especially since his return to office in January 2025) emphasize skepticism of restrictions and a return to normalcy. Below is a summary of Trump’s stance on key aspects of the COVID-19 issue in 2025:
Overall, Donald Trump’s current stance on COVID-19 in 2025 centers on the idea that the pandemic is effectively over as a national emergency and that any attempts to reintroduce pandemic-era restrictions are misguided or politically driven. He positions himself as a champion of normalcy and freedom, vowing to prevent a return to school closures, mask mandates, or vaccine requirements, and he continues to push narratives (like the lab leak theory) that assign blame to opponents both foreign and domestic for the hardships of the pandemic.
Written on June 1, 2025
The table below compares several major COVID-19 vaccines – including the original first-generation vaccines and newer or updated vaccines in use or development by 2025. The comparison covers their names, manufacturers, countries of origin, release dates, vaccine types, efficacy data, side effects, regulatory status, and current usage or development stage:
Vaccine Name | Manufacturer | Country of Production | Original Release Date | Type of Vaccine | Reported Efficacy | Major Side Effects | Regulatory Status | Current Use/Development (2025) |
---|---|---|---|---|---|---|---|---|
Pfizer–BioNTech (Comirnaty) | Pfizer (USA) / BioNTech (Germany) | USA & Germany | Dec 2020 (EUA in US) | mRNA vaccine (lipid nanoparticle-encapsulated mRNA) | ~95% efficacy in preventing symptomatic COVID (original strain). Efficacy against Omicron variant infection is lower (around 40–50% after two doses), but still >85% effective at preventing severe disease with boosters. | Injection-site pain, fatigue, headache, muscle aches, mild fever. Rare cases of myocarditis (inflammation of the heart, mostly in young males after the 2nd dose) have been reported. Generally short-lived side effects. | Full FDA approval for ages 16+ (and later for 12+; brand name Comirnaty). Emergency/conditional approvals in 100+ countries. Updated booster formulations (e.g. Omicron XBB.1.5-specific) authorized under EUA in 2022–2023. Widely endorsed by WHO and regulatory agencies globally. | Remains a mainstay vaccine worldwide. Original formulation used in 2021–22; bivalent and monovalent Omicron-updated boosters rolled out in 2022–2023. In 2025, used for initial series (in some places) and updated annual boosters. Ongoing development of next-gen mRNA shots for broader variant coverage continues. |
Moderna (Spikevax) | Moderna, Inc. | United States | Dec 2020 (EUA in US) | mRNA vaccine (nucleoside-modified mRNA in lipid nanoparticles) | ~94% efficacy in initial trials against symptomatic COVID (ancestral strain). Efficacy reduced against Omicron infection (~40% after primary series) but high protection (~90%+) against severe outcomes after booster. Durable immunity observed, though boosters needed for variants. | Similar to Pfizer: injection-site soreness, fatigue, headaches, fever/chills (systemic effects more pronounced after second dose). Rare myocarditis cases have been observed (again, mostly in younger males). Generally well-tolerated; side effects typically resolve in 1–3 days. | Full FDA approval (Spikevax) for adults. Authorized in many countries worldwide. Bivalent Omicron booster authorized in late 2022; updated monovalent XBB booster authorized in 2023. EMA and other regulators fully approved various age indications. No significant safety recalls; product label includes myocarditis risk information. | Widely used as primary and booster vaccine. By 2025, Moderna has pivoted to supplying updated boosters annually (often in fall). Trials for a combined COVID-19 + influenza mRNA vaccine are underway, with the goal of a single shot for both illnesses in the future. Moderna is also researching pan-coronavirus vaccines to cover multiple variants. |
Oxford–AstraZeneca (Vaxzevria) | Oxford University / AstraZeneca | United Kingdom (developed), produced in multiple countries (UK, EU, India) | Dec 2020 (UK); Jan 2021 (EU) | Viral vector vaccine (chimpanzee adenovirus ChAdOx1 vector carrying spike protein DNA) | ~70% efficacy against symptomatic COVID in Phase III (original strain, pooled across two-dose regimens). A longer interval between doses raised efficacy into the 80% range. Highly effective against severe disease. Efficacy against infection by Beta/Omicron variants dropped significantly (protection against mild illness low without boosting, but remained >80% effective against hospitalization after three doses). | Common side effects: injection-site tenderness, fatigue, mild headache, feverish feelings, and chills for 1–2 days. Notable rare side effect: vaccine-induced immune thrombotic thrombocytopenia (VITT), an unusual clotting disorder, occurred in approximately 1 in 100,000 to 250,000 recipients, more often in younger adults (especially women under 50). This led to age-based usage restrictions in many countries. Also rare Guillain-Barré syndrome cases flagged as possibly associated. | Approved for emergency use in over 100 countries and by WHO (EUL). Never authorized in the US (the company withdrew its FDA application). Many European countries phased out or limited its use by late 2021 due to the rare clotting issue, switching to mRNA vaccines. Still used in many low- and middle-income countries in 2021–22. Some countries like Canada and Australia approved it initially, though usage later declined. AstraZeneca withdrew the EU marketing authorization in 2024 and began withdrawing the vaccine worldwide, citing a surplus of updated vaccines. | Use in 2025 is very limited. It played a critical role in early vaccination globally. Now largely supplanted by other vaccines for boosters. Some countries still use AstraZeneca’s vaccine or its Indian-manufactured counterpart (Covishield) for initial doses in populations with low access to mRNA vaccines. The manufacturer did not continue making variant-specific updates widely; however, the technology is being repurposed for other diseases. Existing stockpiles are occasionally donated to countries in need. Overall, AstraZeneca’s COVID vaccine is a legacy product by 2025, with most immunization programs having moved on to other options. |
Johnson & Johnson (Janssen) | Johnson & Johnson (Janssen Pharmaceuticals) | United States (manufactured in US, Europe, etc.) | Feb 2021 (EUA in US) | Viral vector vaccine (Human adenovirus 26 vector encoding spike protein) | ~66% efficacy against moderate COVID-19 (global trials, all strains) after one dose; ~85% efficacy against severe disease. Provided strong protection against hospitalization and death in early studies. Against Omicron, a single J&J dose offered much less protection from infection (~13% in some analyses), but a second dose or mix-and-match booster significantly improved efficacy. Many recipients eventually needed an mRNA booster for optimal protection. | Typically mild side effects: injection-site pain, headache, fatigue, and fever for a day or two. A rare but serious side effect was Thrombosis with Thrombocytopenia Syndrome (TTS, similar to the AstraZeneca clotting issue), occurring at a very low rate (estimated at roughly 4 cases per million doses). This side effect led US health authorities to prefer other vaccines. Another rare adverse event: Guillain-Barré syndrome was reported in a small number of cases following J&J vaccination. Overall, the one-dose regimen was well tolerated aside from these rare events. | Received Emergency Use Authorization in the US and was authorized in Europe and many other regions. It was a single-dose vaccine, which gave it early appeal. However, due to the TTS safety signal, the CDC and FDA in the US in 2022 recommended limiting its use to situations where other vaccines were not available or for those who specifically requested it. By mid-2023, J&J/Janssen had effectively stopped global distribution; many countries curtailed its use. The FDA revoked the EUA in June 2023 at the manufacturer’s request. The WHO listed it for emergency use in 2021, though its role diminished later. | By 2025, the J&J vaccine is largely out of use. Production was scaled back significantly in 2022. Most remaining doses expired and were disposed of by 2023. Some developing countries used J&J in 2021–2022 for hard-to-reach populations (thanks to one-dose convenience), but they have since transitioned to two-dose or booster strategies with other vaccines. There are no new variant-specific versions of J&J’s vaccine publicly released. Essentially, the Janssen COVID-19 vaccine has been phased out of vaccination programs, and regulatory focus on it is minimal except for monitoring long-term safety in those who received it. |
Novavax (NVX-CoV2373, Nuvaxovid) | Novavax, Inc. | United States (manufactured in USA, EU, India etc.) | Late 2021 (Nov 2021 initial authorizations abroad; July 2022 EUA in US) | Protein subunit vaccine (recombinant SARS-CoV-2 spike protein + Matrix-M adjuvant) | ~90% efficacy in Phase III trials (pre-Delta variants) against symptomatic COVID. Real-world effectiveness remained high against severe disease. Against Omicron, initial two-dose efficacy was lower for preventing infection (~50–60%), but still provided significant protection against severe outcomes. A booster of Novavax or mixing with mRNA restores higher effectiveness. Clinical data suggests robust immune responses including broader T-cell response that may maintain efficacy against new variants to a degree. | Side effect profile is similar to other vaccines but generally slightly more mild in systemic reactions: injection site pain, fatigue, headache, muscle pain for a day or two. Fever is less common than with mRNA vaccines. Rare allergic reactions can occur (as with any vaccine). There have been a few reports of myocarditis following Novavax as well, although it’s unclear if causally linked – these were rare and flagged during some regulatory reviews. Overall, Novavax has been well tolerated, with many viewing it as a “traditional” alternative to mRNA shots. | Authorized for use in over 40 countries, including full or conditional approvals in the EU, UK, Canada, Australia, and WHO EUL. In the US, it received emergency use authorization for adults in 2022 and later for adolescents. It has not yet achieved full FDA approval (as of 2025) but is under review. Novavax was initially approved as a two-dose primary vaccine; later it was also authorized as a booster option (including heterologous boosting after other vaccines). Regulatory agencies have kept Novavax as an option particularly for those who cannot or will not take mRNA vaccines. | In 2025, Novavax’s vaccine is used as an alternative option in many vaccination programs. It appeals to individuals wary of mRNA technology or who prefer a more traditional protein-based vaccine. Several countries use Novavax for booster doses; for example, in the US it’s available as a booster for adults who haven’t had one in the past. The company also developed an updated formulation targeting Omicron variants – this updated Novavax booster received approvals in late 2023. Additionally, Novavax is working on combination vaccines (e.g., a combined COVID-19 and influenza shot) leveraging its protein subunit platform; those are in clinical trials. Despite some financial and manufacturing challenges, Novavax continues to be part of the vaccine mix in 2025, albeit with a smaller market share compared to mRNA vaccines. |
Sinopharm BBIBP-CorV | Sinopharm (China National Biotec Group) | China | Dec 2020 (China conditional approval) | Inactivated whole-virus vaccine (chemically inactivated SARS-CoV-2 virus with alum adjuvant) | Reported 79% efficacy against symptomatic infection in Phase III trials (primarily against original strain). Real-world effectiveness varied: for example, ~50–70% protection against infection observed in different countries early on, and higher effectiveness against severe disease. Against Omicron, two doses of Sinopharm provide limited protection from infection (significantly reduced neutralization), but a third dose (booster) improves efficacy, particularly in preventing hospitalization and death. Overall, its performance was somewhat lower than mRNA or viral vector vaccines, especially as newer variants emerged. | Generally mild side effects: injection site pain, mild fever in a minority of recipients, fatigue, etc. Not associated with the rare clotting or myocarditis issues seen in some other platforms. It has a strong safety record in terms of serious adverse events. Some studies noted slightly higher incidence of brief elevated liver enzymes post-vaccination, but not causing clinical illness. By and large, it’s considered very safe, particularly for older adults who might have been less tolerant of more reactogenic vaccines. | Authorized in China and granted WHO Emergency Use Listing in May 2021, facilitating its global use. It was used in dozens of countries, especially across Asia, Africa, the Middle East, and South America. Many governments deployed Sinopharm as part of early mass vaccination. By 2025, Sinopharm’s vaccine has full approval in China for adults and children (down to age 3). Some other countries have also fully approved it or continue to use it under emergency authorization. Its acceptance in North America/Western Europe was limited (those regions did not approve it), but it’s recognized by WHO, so travelers vaccinated with Sinopharm are generally acknowledged globally. | Sinopharm’s vaccine remains in use, especially in China where the majority of the population (billions of doses) received it as primary series. China has continued to use Sinopharm and Sinovac as foundational vaccines and has administered booster campaigns with them, often mixing with newer vaccines. As of 2025, Sinopharm developed an updated version targeting Omicron variants and also has been working on a protein-subunit and an mRNA vaccine through subsidiaries. However, the classic BBIBP-CorV inactivated vaccine still sees use for initial vaccinations and for people who prefer a tried-and-true platform. Some countries with ample mRNA supply have phased out inactivated vaccines due to lower efficacy, but others continue using them thanks to their easy storage and well-known safety. Sinopharm shots are also part of China’s vaccine diplomacy; they continue to be donated to countries in need. Research in 2025 is looking at using Sinopharm’s vaccine as a base for a bivalent flu-COVID inactivated combo in the future. |
Sinovac CoronaVac | Sinovac Biotech | China | Feb 2021 (China conditional approval; widespread use began earlier under emergency use) | Inactivated whole-virus vaccine (beta-propiolactone inactivated virus with alum adjuvant) | Phase III trial data from various countries showed a range of efficacy: ~51% against symptomatic infection in Brazil (where the trial population were healthcare workers facing high exposure and variant circulation), ~65% in Indonesia, and ~84% against cases and 100% against hospitalization in Turkey (differences due to population and exposure). Essentially, CoronaVac’s efficacy against the original strain was moderate for preventing any symptomatic infection but quite high in preventing severe disease and death. As variants like Delta and Omicron emerged, two doses of CoronaVac offered limited protection against infection, though still helped reduce severe cases. A third dose significantly enhanced antibody levels and clinical protection. Studies in Chile and other countries have shown that with a booster, CoronaVac can achieve ~80%+ effectiveness against severe COVID-19. Without boosting, its efficacy against Omicron infection is very low, reflecting the immune escape of new variants. | Sinovac’s CoronaVac has a favorable safety profile. Common side effects include local pain at the injection site, headache, fatigue, and low-grade fever in a small percentage of recipients. There were no novel serious adverse events strongly linked to CoronaVac in widespread use. It did not show the clotting issues of adenovirus vaccines nor the myocarditis signal of mRNA vaccines. Some data indicated a slightly elevated risk of transient facial paralysis (Bell’s palsy) post-vaccination in a very small number of cases, which was observed in Hong Kong’s vaccine safety monitoring, but the cases recovered and the incidence was extremely low. Overall, it’s regarded as one of the safest COVID vaccines, often used in elderly populations for this reason. 
| CoronaVac has been authorized in 50+ countries and by WHO (Emergency Use Listing in June 2021). It was a workhorse vaccine for many developing nations in 2021 when mRNA supplies were scarce. China vaccinated a large portion of its population with Sinovac as well (alongside Sinopharm). By 2025, CoronaVac is fully approved in China for adults and children (it was one of the first vaccines authorized for children as young as 3). Many nations in Asia (e.g., Indonesia, Thailand initially) and Latin America (e.g., Chile, Brazil early on) included CoronaVac in their rollouts. Western countries did not use CoronaVac, but it’s internationally recognized via WHO. Regulatory status in 2025: still authorized and used, though in some places it’s been supplanted by higher-efficacy vaccines for boosters. | CoronaVac remains in circulation as of 2025, especially for initial immunization in populations that have logistical challenges (since it’s inexpensive to produce and store). Some countries continue to offer it as an option for those who might have allergies or contraindications to other vaccine types. However, given its lower efficacy against Omicron, many countries that used CoronaVac have switched to using mRNA or protein-based boosters to shore up immunity in those initially vaccinated with CoronaVac. Sinovac has been developing updated vaccines: one is an Omicron-specific inactivated booster, and another is a recombinant protein vaccine candidate. There’s also collaboration on an intranasal version. In China’s autumn 2022 booster campaign, for instance, many who had CoronaVac were offered a Pfizer mRNA booster or Sinovac’s own updated version, indicating a heterologous approach. Nonetheless, CoronaVac’s legacy is significant as it vaccinated hundreds of millions. It continues to be manufactured and is part of pandemic preparedness stockpiles. 
Its role in 2025 is mostly as a trusted platform for those who need a simple, reliable vaccine, and it forms the backbone of potential combination vaccines (Sinovac has discussed combining COVID and flu shots into one inactivated vaccine for the future). |
Sputnik V (Gam-COVID-Vac) | Gamaleya Research Institute of Epidemiology and Microbiology | Russia | Aug 2020 (Russian emergency approval; Dec 2020 mass rollout) | Viral vector vaccine (Two-dose regimen using two different human adenovirus vectors: Ad26 for first dose, Ad5 for second) | ~91.6% efficacy against symptomatic COVID-19 in published trial data (Lancet, based on original strain). It showed robust protection against severe disease and hospitalization. Real-world data from some countries (e.g., San Marino, Argentina) supported high effectiveness in practice prior to Omicron. With newer variants, exact efficacy is less clear due to limited new studies; it likely mirrors other adenovector vaccines in reduced protection against infection by Omicron. Sputnik V’s makers claimed it remained ~70-80% effective against Delta for severe disease. Against Omicron, unofficial reports suggest two doses might have considerably lower efficacy preventing infection (perhaps on the order of 20-30%), but boosting with Sputnik Light (single-dose Ad5, similar to a third dose) presumably increases protection. No comprehensive international data, as it wasn’t widely studied outside certain countries. | Short-term side effects: flu-like symptoms (fever, chills, muscle aches) reported by many recipients, especially after the second dose (Ad5 vector tends to be more reactogenic). Injection site pain, headache, and fatigue are also common. These reactions usually resolve in 1–3 days. There haven’t been well-documented unique serious adverse events tied to Sputnik V on a large scale, but transparent monitoring was limited. Notably, Russia did not report significant issues like VITT (clots) publicly, and countries using it also didn’t flag major safety alarms. Some scientists pointed out a theoretical risk of ADE (antibody-dependent enhancement) or influence of pre-existing adenovirus immunity on safety/efficacy, but in practice no major ADE problems emerged. 
Overall, available information suggests Sputnik V’s safety profile is broadly similar to other vaccines, with mostly mild to moderate transient side effects. | Approved for use in Russia and eventually authorized in around 70 countries at the pandemic’s height. However, it never gained approval from WHO or any major Western regulatory authority (EMA/FDA), largely due to incomplete data submissions and manufacturing transparency issues. This limited its international uptake – many countries ordered it in early 2021, but some deliveries lagged or were cancelled. By 2025, Sputnik V is fully approved in Russia and a number of allied or non-aligned countries. In others, emergency authorizations have lapsed as those countries moved to other vaccines. Russia also introduced Sputnik Light (single-dose version) and a nasal spray version (for booster use) for domestic use. Regulatory standing outside Russia is mixed; some countries in Africa and Latin America still recognize it for vaccination status, but others have phased it out. | Sputnik V is still being administered within Russia and a few countries that maintain supplies (such as Iran, which produces it locally under license, and some parts of Africa or the Middle East that received shipments). Its global presence has diminished in 2025 due to competition from WHO-approved vaccines and the logistical challenges Russia faced. There were announcements about a Sputnik V version adapted for Omicron, but widespread deployment of that is unclear. Russia has focused on boosting its population with a combination of Sputnik Light boosters and possibly mRNA vaccines co-developed with other countries (e.g., there was talk of a Sputnik mRNA in development). Internationally, travel acceptance of Sputnik V is a mixed bag – many countries now allow entry regardless of vaccine, but where proof is needed some still do not count Sputnik V, requiring travelers to get another vaccine. 
In summary, Sputnik V in 2025 is mainly of local importance in Russia and select regions. Its developers continue to advocate for its effectiveness and are trying to get it WHO-listed, but without success thus far. The vaccine’s technology (heterologous adenovirus prime-boost) proved effective originally, and research is ongoing to apply this platform to other diseases, even if its COVID-19 role has tapered. |
Bharat Biotech iNCOVACC (BBV154) | Bharat Biotech International Ltd. | India | Sept 2022 (Emergency approval in India as booster) | Intranasal vaccine (Adenovirus-vectored vaccine given as nasal drops; uses a chimpanzee adenovirus vector expressing stabilized spike protein, delivered to nasal mucosa) | As a new type of vaccine, traditional efficacy percentages from Phase III trials have not been widely published in the same way. Immunogenicity data showed that iNCOVACC induces strong mucosal IgA immunity and systemic neutralizing antibodies when used as a booster. In early trials, it successfully produced an immune response comparable to existing vaccines. It is approved as a heterologous booster based on data that it enhances immunity in those previously vaccinated with other vaccines. While precise efficacy against infection isn’t publicly quantified, it’s expected to provide an extra layer of protection in the upper respiratory tract. Its efficacy in preventing severe disease would rely on priming from prior doses – as a booster it’s aimed at broadening immunity. Research is ongoing; preliminary results indicate it can reduce the risk of symptomatic infection when given after two doses of intramuscular vaccine, but we await real-world effectiveness figures. | Side effects are generally minimal and mostly local: some recipients experience a mild runny or stuffy nose, sneezing, or slight nasal irritation for a short period after the drops. Because it’s needle-free, it avoids injection-site reactions entirely. In trials, iNCOVACC demonstrated a good safety profile – no serious vaccine-related adverse events reported. Recipients did not report significant systemic reactions (like fever or fatigue) at rates higher than placebo. The ease of administration (just drops in the nose) also improves its tolerability. 
Being an intranasal live-vectored vaccine, monitoring is in place for any rare allergic reactions or if it could trigger reactive airway issues, but so far it’s been very well tolerated. | Approved under India’s emergency use authorization as the world’s first intranasal COVID-19 vaccine. Initially approved in late 2022 as a booster dose for adults who had previously received other vaccines. It has since been incorporated as an option in India’s vaccination program (marketed as iNCOVACC). Regulatory approval is currently limited to India, although Bharat Biotech has been in talks with other countries and the WHO. As of 2025, it has not yet received WHO EUL or approvals in Western countries. India’s regulators vetted its Phase III data for safety and immunogenicity to authorize it. Ongoing trials aim at full licensure and at potential use as a primary series vaccine as well. | In 2025, iNCOVACC is being deployed in India as a booster, particularly to entice people who are needle-phobic or to add an extra layer of mucosal immunity for those who had two traditional shots already. Adoption has been modest but growing, given that it’s a newer concept. Bharat Biotech scaled up production and the vaccine is available via India’s national Co-WIN portal for adults as a booster choice. There is interest internationally: some countries in Asia and Africa have expressed intent to evaluate or import the intranasal vaccine pending local approval. Scientific interest is high because intranasal vaccines could theoretically reduce transmission by building immunity right where the virus enters. Bharat Biotech is also studying iNCOVACC as a primary 2-dose vaccine (not just a booster) to see if it can confer strong protection on its own. Furthermore, combination approaches (one intramuscular dose and one intranasal dose) are being researched. As of now, iNCOVACC is on the cutting edge of next-generation vaccine strategy – one of the few licensed mucosal vaccines for COVID-19. 
It symbolizes the kind of innovation aimed at improving convenience and broadening immune protection as we move into the next phase of vaccination campaigns. |
Written on June 1, 2025
Table of Contents
- Mathematical Foundation
- Code and Step-by-Step Explanation
- 0. Install and Load Required Packages
- 1. Define Output Directory
- 2. Data Import and Preparation
- 3. Exploratory Data Analysis (EDA)
- 4. Kaplan-Meier Plots Stratified by Site
- 5. Cox Proportional Hazards Model Including Site
- 6. Checking for Multicollinearity
- 7. Penalized Cox Regression (Lasso)
- 8. Fine-Gray Model for Competing Risks
- 9. Visualizing Survival Differences Across Sites
- 10. Additional Recommendations and Checks
- 11. End of Script
The Kaplan-Meier (KM) estimator is a non-parametric statistic used to estimate the survival function \( S(t) \). If there are \( n \) subjects, let \( t_{(1)}, t_{(2)}, \dots, t_{(k)} \) be the distinct event times in ascending order. Let \( d_j \) be the number of events (clearances) at time \( t_{(j)} \) and \( r_j \) be the number of subjects at risk just before \( t_{(j)} \). The KM estimator is:
\[ \hat{S}(t) = \prod_{t_{(j)} \le t} \left( 1 - \frac{d_j}{r_j} \right). \]
It allows comparing the time to clearance across different sites (e.g., stool vs. urine).
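As a minimal sketch of the product formula above, the following R snippet computes the KM estimate by hand on a small made-up dataset and compares it with `survfit` from the survival package. The data values and the helper name `km_by_hand` are illustrative assumptions, not part of the original analysis.

```r
library(survival)

# Hypothetical toy data: days to clearance and event indicator
# (1 = clearance observed, 0 = censored)
time  <- c(3, 5, 5, 8, 10, 12, 12, 15)
event <- c(1, 1, 0, 1, 0, 1, 1, 0)

km_by_hand <- function(time, event) {
  event_times <- sort(unique(time[event == 1]))  # distinct event times t_(j)
  surv <- numeric(length(event_times))
  s <- 1
  for (j in seq_along(event_times)) {
    d_j <- sum(time == event_times[j] & event == 1)  # events at t_(j)
    r_j <- sum(time >= event_times[j])               # at risk just before t_(j)
    s <- s * (1 - d_j / r_j)                         # running product
    surv[j] <- s
  }
  data.frame(time = event_times, surv = surv)
}

km_manual <- km_by_hand(time, event)

# Reference values from the survival package at the same time points
km_fit <- survfit(Surv(time, event) ~ 1)
km_ref <- summary(km_fit, times = km_manual$time)$surv

print(cbind(km_manual, survfit = km_ref))
```

The two columns should agree, which is a quick way to confirm how \( d_j \) and \( r_j \) are counted when ties and censored observations share a time point.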
The Cox Proportional Hazards model expresses the hazard function for individual \( i \) as:
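\[ \lambda_i(t) = \lambda_0(t) \exp\left( \boldsymbol{\beta}^\top \mathbf{x}_i \right), \]
where \( \lambda_0(t) \) is the baseline hazard, \( \boldsymbol{\beta} \) is the vector of regression coefficients, and \( \mathbf{x}_i \) is the covariate vector for individual \( i \).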
A hazard ratio between two covariate levels reflects the ratio of their hazards at any given time \( t \). If a coefficient \( \beta_j \) is positive, it suggests an increased rate of clearance for that factor level.
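To make the hazard-ratio statement concrete, consider two individuals \( a \) and \( b \) (labels introduced here for illustration). The baseline hazard cancels in the ratio, which is why the ratio does not depend on \( t \):

\[ \frac{\lambda_a(t)}{\lambda_b(t)} = \frac{\lambda_0(t) \exp\left( \boldsymbol{\beta}^\top \mathbf{x}_a \right)}{\lambda_0(t) \exp\left( \boldsymbol{\beta}^\top \mathbf{x}_b \right)} = \exp\left( \boldsymbol{\beta}^\top (\mathbf{x}_a - \mathbf{x}_b) \right). \]

If the two individuals differ only by one unit in covariate \( j \), the ratio reduces to \( \exp(\beta_j) \). For an illustrative value \( \beta_j = 0.5 \), this gives \( \exp(0.5) \approx 1.65 \), i.e., roughly a 65% higher instantaneous rate of clearance.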
When multiple competing events can occur (e.g., clearance is the event of interest, but death or loss to follow-up might preclude clearance), a competing risk model such as Fine and Gray is used:
\[ \text{Fine-Gray: } \quad \text{Subdistribution hazard } = h_j(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t, \epsilon = j \mid T \ge t \text{ or } (T < t \text{ and } \epsilon \neq j))}{\Delta t}, \]
where \( \epsilon \) indicates the type of event (\( j = \) clearance, other = competing events). This method accounts for the fact that once a competing event (e.g., death) happens, one can no longer clear the pathogen.
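The subdistribution hazard is useful because it maps directly onto the cumulative incidence function. Writing \( F_j(t) = P(T \le t, \epsilon = j) \) (notation introduced here, not in the original text), the two are linked by:

\[ h_j(t) = -\frac{d}{dt} \log\left( 1 - F_j(t) \right), \qquad F_j(t) = 1 - \exp\left( -\int_0^t h_j(u) \, du \right). \]

Under a proportional subdistribution hazards model \( h_j(t \mid \mathbf{x}) = h_{j0}(t) \exp(\boldsymbol{\beta}^\top \mathbf{x}) \), this gives \( 1 - F_j(t \mid \mathbf{x}) = \left( 1 - F_{j0}(t) \right)^{\exp(\boldsymbol{\beta}^\top \mathbf{x})} \), so a positive coefficient corresponds to a higher cumulative incidence of clearance at every time point.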
Step | Section | Purpose |
---|---|---|
0 | Install Packages | Ensures necessary packages are installed |
1 | Define Directory | Output directory for storing analysis results |
2 | Data Import/Prep | Loads and cleans the data |
3 | Exploratory Analysis | Summaries and missing data checks |
4 | Kaplan-Meier Analysis | Non-parametric survival estimates by site |
5 | Cox Model (Simple) | Estimates hazard ratios for clearance with Site only |
6 | Multicollinearity Check | Detects correlated predictors using VIF |
7 | Penalized Cox (Lasso) | Regularization to handle many or correlated predictors |
8 | Fine-Gray Model | Competing risks approach for clearance vs. death |
9 | Visualization | Visually compares survival curves by site |
10 | Recommendations/Checks | Influential points, outliers, and missing data |
11 | End of Script | Wrap-up, optional housekeeping |
Note: The code presented here is R-based and uses packages such as survival, survminer, cmprsk, dplyr, ggplot2, and glmnet. Adjust package calls as needed for local environment.
# -------------------------------------------
# Survival Analysis on Quarantine Clearance
# Comparing Site-Specific Clearance: Stool, Urine, Sputum, Blood
# © 2024 Hyunsuk Frank Roh, MD. All rights reserved.
# -------------------------------------------
# 0. Setup: Install and Load Required Packages
install_and_load <- function(package) {
if (!require(package, character.only = TRUE)) {
install.packages(package, dependencies = TRUE)
library(package, character.only = TRUE)
}
}
# tidyr is required for drop_na(), which is used in the penalized Cox and Fine-Gray steps
required_packages <- c("survival", "survminer", "cmprsk", "dplyr", "tidyr", "readr",
"ggplot2", "patchwork", "forcats", "reshape2", "glmnet", "car")
# Load dplyr early to avoid conflicts
suppressMessages(library(dplyr))
invisible(lapply(required_packages, install_and_load))
Interpretation:
- This section ensures that all required packages for reading data, plotting, modeling, and statistical testing are present and loaded.
- Automated checks help avoid missing dependencies during runtime.
# 1. Define Output Directory
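# USERPROFILE is Windows-specific; fall back to the home directory on other systems
desktop_path <- file.path(Sys.getenv("USERPROFILE", unset = path.expand("~")), "Desktop")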
output_dir <- file.path(desktop_path, "Survival_Analysis_Images")
if (!dir.exists(output_dir)) {
dir.create(output_dir, recursive = TRUE)
message(paste("Created directory:", output_dir))
} else {
message(paste("Directory already exists:", output_dir))
}
Interpretation:
- Specifies where plots and model summaries will be stored.
- Recursive creation of directories ensures sub-directories exist when writing outputs.
# 2. Data Import and Preparation
data_path <- file.path(desktop_path, "dataset05.csv")
my_data <- read_csv(data_path, locale = locale(encoding = "UTF-8"))
cat("First few rows of the dataset:\n")
print(head(my_data))
cat("Structure of the dataset:\n")
str(my_data) # str() prints directly; wrapping it in print() only adds a stray NULL
# Convert variables to appropriate data types
my_data <- my_data %>%
mutate(
PatientID = as.factor(PatientID),
Pathogen = as.factor(Pathogen),
Site = as.factor(Site),
Start_Date = as.Date(Start_Date, format = "%m/%d/%Y"),
Event_Date = as.Date(Event_Date, format = "%m/%d/%Y"),
Event_Type = as.factor(Event_Type),
Event_Code = as.numeric(Event_Code),
Age = as.numeric(Age),
Gender = as.factor(Gender),
FactorA = as.factor(FactorA),
FactorB = as.factor(FactorB),
FactorC = factor(FactorC, levels = c("Low", "Medium", "High"), ordered = TRUE)
)
cat("Structure after type conversions:\n")
str(my_data) # str() prints directly; wrapping it in print() only adds a stray NULL
Interpretation:
- Reads the CSV file containing survival (clearance) data and sets correct data types for key variables.
- Event_Code should indicate whether an event (clearance) occurred (e.g., 1 = clearance, 0 = censored, 2 = competing event if used).
- This step ensures subsequent survival analysis is consistent with R’s expectations for times, events, and factors.
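Before fitting any model, it can help to verify the coding described above. The sketch below builds a few hypothetical rows using the column names from the script (the values are made up) and flags rows with an unexpected Event_Code or a non-positive follow-up time.

```r
# Hypothetical toy rows; only the column names come from the script
toy <- data.frame(
  PatientID  = c("P01", "P02", "P03", "P04"),
  Site       = c("Stool", "Urine", "Sputum", "Blood"),
  Start_Date = c("01/05/2024", "01/07/2024", "01/10/2024", "01/12/2024"),
  Event_Date = c("01/19/2024", "01/30/2024", "01/10/2024", "02/01/2024"),
  Event_Code = c(1, 0, 1, 2),
  stringsAsFactors = FALSE
)

# Same date format as the main script
toy$Start_Date <- as.Date(toy$Start_Date, format = "%m/%d/%Y")
toy$Event_Date <- as.Date(toy$Event_Date, format = "%m/%d/%Y")
toy$Time_to_Event <- as.numeric(difftime(toy$Event_Date, toy$Start_Date, units = "days"))

# Check 1: Event_Code only takes the expected values (0, 1, 2)
bad_code <- !toy$Event_Code %in% c(0, 1, 2)

# Check 2: dates parsed correctly and follow-up time is strictly positive
bad_time <- is.na(toy$Time_to_Event) | toy$Time_to_Event <= 0

problems <- toy[bad_code | bad_time, c("PatientID", "Event_Code", "Time_to_Event")]
print(problems)
```

Here the third row is flagged because its start and event dates coincide; the main script removes such rows in step 4.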
# 3. Exploratory Data Analysis (EDA)
cat("Summary statistics:\n")
print(summary(my_data))
cat("Missing values per column:\n")
print(sapply(my_data, function(x) sum(is.na(x))))
cat("Event distribution across sites:\n")
print(table(my_data$Site, my_data$Event_Code))
Interpretation:
- Summary statistics and missing value checks identify potential data issues.
- A two-way table with Site vs. Event_Code reveals how many clearance events occurred at each site.
# 4. Kaplan-Meier Plots Stratified by Site
if (!"Time_to_Event" %in% colnames(my_data)) {
my_data <- my_data %>%
mutate(Time_to_Event = as.numeric(difftime(Event_Date, Start_Date, units = "days")))
}
if (any(my_data$Time_to_Event <= 0, na.rm = TRUE)) {
cat("There are observations with non-positive Time_to_Event. These will be removed.\n")
my_data <- my_data %>%
filter(Time_to_Event > 0)
}
# Create the survival object
surv_object_site <- Surv(time = my_data$Time_to_Event, event = my_data$Event_Code == 1)
# Fit Kaplan-Meier survival curves stratified by Site
km_fit_site <- survfit(surv_object_site ~ Site, data = my_data)
# Plot the Kaplan-Meier curves
km_plot <- ggsurvplot(
km_fit_site,
data = my_data,
risk.table = TRUE,
pval = TRUE, # Adds p-value from log-rank test
conf.int = TRUE,
xlab = "Time in Days",
ylab = "Probability of Not Yet Cleared", # KM plots S(t): clearance has not yet occurred
title = "Kaplan-Meier Curves Stratified by Site",
legend.title = "Site",
palette = scales::hue_pal()(length(levels(my_data$Site)))
)
print(km_plot)
ggsave(
filename = file.path(output_dir, "Kaplan_Meier_Curves_Stratified_by_Site.png"),
plot = km_plot$plot, width = 8, height = 6
)
ggsave(
filename = file.path(output_dir, "Kaplan_Meier_Risk_Table_Stratified_by_Site.png"),
plot = km_plot$table, width = 8, height = 3
)
Mathematical Note:
- The Kaplan-Meier approach estimates \( \hat{S}(t) \) (the probability that clearance has not yet occurred by time \( t \)).
- Stratification by Site compares different clearance curves. A log-rank test checks if the curves are significantly different.
Interpretation:
- A steeper curve indicates faster clearance.
- The p-value clarifies whether differences in clearance times across sites are statistically significant.
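The numbers behind the plot can also be extracted directly. The following sketch, on simulated data with assumed site names and rates, pulls the median time to clearance per site and computes the log-rank p-value with `survdiff`; it is an illustration, not the original analysis.

```r
library(survival)

set.seed(42)
n <- 60

# Simulated toy data: two sites with different (assumed) clearance rates
toy <- data.frame(
  Site = factor(rep(c("Stool", "Urine"), each = n / 2)),
  Time_to_Event = c(rexp(n / 2, rate = 1 / 30), rexp(n / 2, rate = 1 / 15)),
  Event_Code = rbinom(n, size = 1, prob = 0.8)  # 1 = clearance, 0 = censored
)

# Median time to clearance per site (NA if a curve never drops to 0.5)
fit <- survfit(Surv(Time_to_Event, Event_Code == 1) ~ Site, data = toy)
medians <- summary(fit)$table[, "median"]
print(medians)

# Log-rank test: chi-square statistic with (number of groups - 1) degrees of freedom
lr <- survdiff(Surv(Time_to_Event, Event_Code == 1) ~ Site, data = toy)
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)
print(p_value)
```

Reporting the medians alongside the p-value gives readers an effect size in days rather than only a significance statement.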
# 5. Cox Proportional Hazards Model Including Site
cox_model_simple <- coxph(surv_object_site ~ Site, data = my_data)
cat("Summary of Simplified Cox Model (Only Site):\n")
summary_cox_simple <- summary(cox_model_simple)
print(summary_cox_simple)
sink(file = file.path(output_dir, "Summary_Simplified_Cox_Model.txt"))
print(summary_cox_simple)
sink()
# Test proportional hazards assumption
ph_test_simple <- tryCatch(
cox.zph(cox_model_simple),
error = function(e) {
cat("Error in proportional hazards test:\n", e$message, "\n")
return(NULL)
}
)
if (!is.null(ph_test_simple)) {
cat("Proportional Hazards Test for Simplified Model:\n")
print(ph_test_simple)
sink(file = file.path(output_dir, "PH_Test_Simplified_Cox_Model.txt"))
print(ph_test_simple)
sink()
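  ph_plot_simple <- ggcoxzph(ph_test_simple)
  print(ph_plot_simple)
  # ggcoxzph() returns a list of ggplot panels, so arrange them into one grob before saving
  ggsave(
    filename = file.path(output_dir, "Schoenfeld_Residuals_Simplified_Cox_Model.png"),
    plot = gridExtra::arrangeGrob(grobs = ph_plot_simple), width = 8, height = 6
  )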
} else {
cat("Proportional hazards assumption test could not be performed.\n")
}
Mathematical Note:
- The Cox model: \( \lambda_i(t) = \lambda_0(t) \exp(\beta_1 X_{i1} + \cdots + \beta_p X_{ip}) \).
- Here, the predictor is only Site (categorical). The exponent of a coefficient, \( \exp(\hat{\beta}) \), is the hazard ratio relative to the reference site.
- If \( \exp(\hat{\beta}) > 1 \), that site experiences faster clearance (on average) than the reference site.
- If \( \exp(\hat{\beta}) < 1 \), that site has a slower clearance rate compared to the reference site.
Interpretation:
- The cox.zph test checks whether the proportional hazards assumption is valid.
- A non-significant result suggests no major violation of the assumption for that predictor.
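A compact way to report the fitted model is a table of hazard ratios with confidence intervals. The sketch below uses simulated data; the site names, baseline rate, and `true_beta` values are assumptions chosen only for illustration.

```r
library(survival)

set.seed(1)
n <- 200
site <- factor(sample(c("Blood", "Stool", "Urine"), n, replace = TRUE))

# Assumed true log hazard ratios relative to the reference level (Blood)
true_beta <- c(Blood = 0, Stool = 0.7, Urine = -0.4)

# Exponential event times whose rate depends on site, plus random censoring
event_time  <- rexp(n, rate = 0.05 * exp(true_beta[as.character(site)]))
censor_time <- runif(n, min = 5, max = 60)

toy <- data.frame(
  Site   = site,
  time   = pmin(event_time, censor_time),
  status = as.integer(event_time <= censor_time)  # 1 = clearance, 0 = censored
)

fit <- coxph(Surv(time, status) ~ Site, data = toy)

# Hazard ratios with 95% confidence intervals, on the exp(beta) scale
hr_table <- cbind(HR = exp(coef(fit)), exp(confint(fit)))
print(round(hr_table, 2))
```

Each row compares one site with the reference level (the first factor level, here Blood); `relevel()` can be used to change the reference if a different comparison is more natural.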
# 6. Checking for Multicollinearity
cox_model_full <- coxph(surv_object_site ~ Site + FactorA + FactorB + FactorC + Age + Gender, data = my_data)
print("Summary of Full Cox Model:")
summary_cox_full <- summary(cox_model_full)
print(summary_cox_full)
sink(file = file.path(output_dir, "Summary_Full_Cox_Model.txt"))
print(summary_cox_full)
sink()
# Compute VIF
if(!is.null(cox_model_full$coefficients)) {
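  # model.frame() drops rows with missing values so the response and design matrix stay aligned
  vif_frame <- model.frame(Time_to_Event ~ Site + FactorA + FactorB + FactorC + Age + Gender, data = my_data)
  design_matrix <- model.matrix(~ Site + FactorA + FactorB + FactorC + Age + Gender, data = vif_frame)[, -1]
  vif_input <- data.frame(Time_to_Event = vif_frame$Time_to_Event, design_matrix)
  lm_fit <- lm(Time_to_Event ~ ., data = vif_input)
  vif_values <- car::vif(lm_fit)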
print("Variance Inflation Factor (VIF) for Predictors:")
print(vif_values)
write.csv(as.data.frame(vif_values), file = file.path(output_dir, "VIF_Full_Cox_Model.csv"), row.names = TRUE)
} else {
warning("Full Cox model did not converge. Skipping VIF computation.")
}
Mathematical Note:
- The VIF (Variance Inflation Factor) for each predictor \( X_j \) is:
\[ \mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \]
where \( R_j^2 \) is the coefficient of determination when \( X_j \) is regressed on all other predictors.
- A high VIF (commonly > 5 or 10) indicates multicollinearity.
Interpretation:
- Identifying collinear factors ensures stable model estimates.
- High collinearity in a Cox model can lead to large standard errors for coefficient estimates.
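The VIF formula can be checked by hand with base R. In this sketch the predictors `x1`, `x2`, and `x3` are simulated (hypothetical names and values), with `x2` deliberately correlated with `x1`; each VIF is computed from the \( R_j^2 \) of regressing one predictor on the others.

```r
set.seed(7)
n <- 100
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.5)  # deliberately correlated with x1
x3 <- rnorm(n)                       # independent of the others
X <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing X_j on all other predictors
vif_by_hand <- sapply(names(X), function(j) {
  others <- setdiff(names(X), j)
  r2 <- summary(lm(reformulate(others, response = j), data = X))$r.squared
  1 / (1 - r2)
})

print(round(vif_by_hand, 2))
```

The correlated pair should show clearly larger values than `x3`. The same quantities equal the diagonal of the inverse correlation matrix, `diag(solve(cor(X)))`, which offers a quick cross-check.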
# 7. Penalized Cox Regression (Lasso)
my_data_pen <- my_data %>%
select(Time_to_Event, Event_Code, Site, FactorA, FactorB, FactorC, Age, Gender) %>%
drop_na()
surv_object_pen <- Surv(time = my_data_pen$Time_to_Event, event = my_data_pen$Event_Code == 1)
x_pen <- model.matrix(~ Site + FactorA + FactorB + FactorC + Age + Gender, data = my_data_pen)[, -1]
y_pen <- surv_object_pen
set.seed(123)
cv_fit <- cv.glmnet(x_pen, y_pen, family = "cox", alpha = 1, standardize = TRUE)
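plot(cv_fit)
title("Cross-Validation for Penalized Cox Regression", line = 2.5)
# plot() for cv.glmnet uses base graphics and returns NULL, so save it
# with a graphics device instead of ggsave()
png(filename = file.path(output_dir, "Penalized_Cox_CV_Plot.png"),
    width = 8, height = 6, units = "in", res = 300)
plot(cv_fit)
title("Cross-Validation for Penalized Cox Regression", line = 2.5)
dev.off()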
best_lambda <- cv_fit$lambda.min
print(paste("Best lambda selected by cross-validation:", best_lambda))
penalized_cox <- glmnet(x_pen, y_pen, family = "cox", alpha = 1, lambda = best_lambda, standardize = TRUE)
print("Coefficients from Penalized Cox Model:")
penalized_cox_coef <- coef(penalized_cox)
print(penalized_cox_coef)
write.csv(as.data.frame(as.matrix(penalized_cox_coef)),
file = file.path(output_dir, "Penalized_Cox_Model_Coefficients.csv"),
row.names = TRUE)
Mathematical Note:
- Penalized Cox adds an \( L_1 \) penalty term \( \alpha \lambda \|\beta\|_1 \) (with \( \alpha=1 \)) to the negative log partial likelihood. This is known as the Lasso, and it encourages sparse solutions (some coefficients shrink exactly to 0).
Interpretation:
- Lasso helps in variable selection and reducing overfitting, especially when many predictors might be collinear or the sample size is small.
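For reference, the penalized objective can be written out as follows. This is a generic sketch: \( \delta_i \) (event indicator) and \( R(t_i) \) (the risk set at time \( t_i \)) are notation introduced here, ties are ignored, and the scaling constant in front of the likelihood differs between software implementations.

\[ \ell(\beta) = \sum_{i:\, \delta_i = 1} \left[ \beta^\top x_i - \log \sum_{k \in R(t_i)} \exp\left( \beta^\top x_k \right) \right], \qquad \hat{\beta} = \arg\min_{\beta} \left\{ -\frac{1}{n} \ell(\beta) + \lambda \|\beta\|_1 \right\}. \]

As \( \lambda \) increases, more coefficients are set exactly to zero; cross-validation selects the \( \lambda \) that balances fit against sparsity.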
# 8. Fine-Gray Model for Competing Risks
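# Event_Code == 0: Censored
# Event_Code == 1: Clearance (event of interest)
# Event_Code == 2: Some competing event (e.g., Death)
# crr() defaults to failcode = 1 and cencode = 0, so the status keeps that coding:
# censored rows stay in the data and clearance is the modeled event.
my_data_fg <- my_data %>%
  mutate(
    Competing_Event = case_when(
      Event_Code == 0 ~ 0, # Censored
      Event_Code == 1 ~ 1, # Event of interest (clearance)
      Event_Code == 2 ~ 2, # Competing event
      TRUE ~ NA_real_
    )
  )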
na_count_fg <- sum(is.na(my_data_fg$Competing_Event))
print(paste("Number of NA values in Competing_Event:", na_count_fg))
write.csv(data.frame(Column = "Competing_Event", NA_Count = na_count_fg),
file = file.path(output_dir, "NA_Counts_Competing_Event.csv"),
row.names = FALSE)
my_data_clean_fg <- my_data_fg %>%
filter(!is.na(Competing_Event)) %>%
drop_na(Site, FactorA, FactorB, FactorC, Age, Gender, Time_to_Event, Competing_Event)
print("Dimensions after cleaning for Fine-Gray model:")
print(dim(my_data_clean_fg))
write.csv(data.frame(Dimensions = paste(dim(my_data_clean_fg), collapse = "x")),
file.path(output_dir, "Fine_Gray_Clean_Data_Dimensions.csv"),
row.names = FALSE)
covariates_fg <- model.matrix(~ Site + FactorA + FactorB + FactorC + Age + Gender, data = my_data_clean_fg)[, -1]
print(paste("Number of covariates:", ncol(covariates_fg)))
print(paste("Number of observations:", nrow(my_data_clean_fg)))
write.csv(data.frame(Covariates = ncol(covariates_fg), Observations = nrow(my_data_clean_fg)),
file.path(output_dir, "Covariate_Matrix_Dimensions_Fine_Gray.csv"),
row.names = FALSE)
if(nrow(covariates_fg) == nrow(my_data_clean_fg)) {
fg_model_site <- try(crr(
ftime = my_data_clean_fg$Time_to_Event,
fstatus = my_data_clean_fg$Competing_Event,
cov1 = covariates_fg
), silent = TRUE)
if(!inherits(fg_model_site, "try-error")) {
print("Summary of Fine-Gray Model:")
summary_fg <- summary(fg_model_site)
print(summary_fg)
sink(file = file.path(output_dir, "Summary_Fine_Gray_Model.txt"))
print(summary_fg)
sink()
} else {
warning("Fine-Gray model failed to converge. Check data for issues.")
error_message <- "Fine-Gray model failed to converge. Check data for issues."
write.csv(data.frame(Error = error_message),
file.path(output_dir, "Fine_Gray_Model_Error.csv"),
row.names = FALSE)
}
} else {
stop("Mismatch in the number of rows between covariates and survival data for Fine-Gray model.")
}
Mathematical Note:
- Fine and Gray (1999) proposed modeling the subdistribution hazard of a particular event type in the presence of competing events.
- The subdistribution hazard function is different from the cause-specific hazard: it explicitly accounts for individuals who have experienced the competing event but remain “at risk” in the subdistribution sense.
Interpretation:
- Use this approach if competing events (like death) might preclude observing clearance.
- The model yields subdistribution hazard ratios, interpreted as the effect of covariates on the cumulative incidence of clearance.
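Because the Fine-Gray model is interpreted on the cumulative incidence scale, it can help to see how a cumulative incidence function is estimated without covariates. The base-R sketch below implements the standard nonparametric estimator on made-up data; the helper name `cif_by_hand` and the data values are illustrative assumptions.

```r
# Hypothetical toy data: status 0 = censored, 1 = clearance, 2 = competing event
time   <- c(2, 3, 3, 5, 6, 8, 9, 12)
status <- c(1, 2, 1, 0, 1, 2, 0, 1)

cif_by_hand <- function(time, status, cause = 1) {
  event_times <- sort(unique(time[status != 0]))  # times with any event
  surv_prev <- 1  # all-cause KM survival just before the current time
  cif <- 0
  out <- numeric(length(event_times))
  for (k in seq_along(event_times)) {
    t_k <- event_times[k]
    at_risk <- sum(time >= t_k)
    d_cause <- sum(time == t_k & status == cause)  # events of the given cause
    d_any   <- sum(time == t_k & status != 0)      # events of any cause
    cif <- cif + surv_prev * d_cause / at_risk     # increment of the CIF
    surv_prev <- surv_prev * (1 - d_any / at_risk) # update all-cause survival
    out[k] <- cif
  }
  data.frame(time = event_times, cif = out)
}

cif_clear <- cif_by_hand(time, status, cause = 1)
cif_comp  <- cif_by_hand(time, status, cause = 2)
print(cbind(cif_clear, cif_competing = cif_comp$cif))
```

Treating competing events as censored in a Kaplan-Meier analysis would overstate the probability of clearance; the cumulative incidence estimate avoids this by only crediting clearance among subjects who are still event-free. In this toy example the last observation is an event, so the two cumulative incidences sum to 1 at the final time.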
# 9. Visualizing Survival Differences Across Sites
events_per_site <- my_data %>%
group_by(Site) %>%
summarize(Events = sum(Event_Code == 1, na.rm = TRUE))
print("Number of events per site:")
print(events_per_site)
write.csv(events_per_site, file.path(output_dir, "Events_Per_Site.csv"), row.names = FALSE)
valid_sites <- events_per_site %>%
filter(Events > 0) %>%
pull(Site)
my_data_valid <- my_data %>%
filter(Site %in% valid_sites) %>%
droplevels() # drop unused Site levels so legend labels and per-site loops match the data
surv_object_site_valid <- Surv(time = my_data_valid$Time_to_Event, event = my_data_valid$Event_Code == 1)
km_fit_site_valid <- survfit(surv_object_site_valid ~ Site, data = my_data_valid)
km_plot_valid <- ggsurvplot(
km_fit_site_valid,
data = my_data_valid,
risk.table = TRUE,
pval = TRUE,
conf.int = TRUE,
xlab = "Time in Days",
ylab = "Survival Probability",
title = "Survival Curves by Site",
legend.title = "Site",
legend.labs = levels(my_data_valid$Site),
palette = c("#E7B800", "#2E9FDF", "#FC4E07", "#00BA38"),
ggtheme = theme_minimal()
)
print(km_plot_valid)
ggsave(filename = file.path(output_dir, "Kaplan_Meier_Curves_By_Site_Valid_Sites.png"),
plot = km_plot_valid$plot, width = 8, height = 6)
ggsave(filename = file.path(output_dir, "Kaplan_Meier_Risk_Table_By_Site_Valid_Sites.png"),
plot = km_plot_valid$table, width = 8, height = 3)
obs_per_site <- my_data_valid %>%
group_by(Site) %>%
summarize(Count = n())
print("Number of observations per valid site:")
print(obs_per_site)
write.csv(obs_per_site, file.path(output_dir, "Observations_Per_Valid_Site.csv"), row.names = FALSE)
if(all(obs_per_site$Count >= 2)) {
km_facet_plot <- ggsurvplot_facet(
km_fit_site_valid,
data = my_data_valid,
facet.by = "Site",
nrow = 2,
ncol = 2,
risk.table = TRUE,
pval = FALSE,
conf.int = TRUE,
xlab = "Time in Days",
ylab = "Survival Probability",
title = "Faceted Kaplan-Meier Curves by Site",
ggtheme = theme_minimal()
)
  print(km_facet_plot)
  # ggsurvplot_facet() returns a single ggplot object (no separate risk table), so save it directly
  ggsave(filename = file.path(output_dir, "Faceted_Kaplan_Meier_Curves_By_Site.png"),
         plot = km_facet_plot, width = 12, height = 8)
} else {
warning("Not all sites have enough observations for faceted plots. Creating separate plots for each site.")
library(patchwork)
plots <- list()
risk_tables <- list()
for (site in levels(my_data_valid$Site)) {
    # Use the same per-site subset for both the fit and the plot
    site_data <- my_data_valid %>% filter(Site == site)
    fit_site <- survfit(Surv(Time_to_Event, Event_Code == 1) ~ 1, data = site_data)
    p <- ggsurvplot(
      fit_site,
      data = site_data,
risk.table = TRUE,
pval = FALSE,
conf.int = TRUE,
xlab = "Time in Days",
ylab = "Survival Probability",
title = paste("Survival Curve for", site),
ggtheme = theme_minimal()
)
plots[[site]] <- p$plot
risk_tables[[site]] <- p$table
}
combined_plot <- wrap_plots(plots, ncol = 2)
print(combined_plot)
ggsave(filename = file.path(output_dir, "Combined_Kaplan_Meier_Curves_By_Site.png"),
plot = combined_plot, width = 12, height = 8)
for (site in levels(my_data_valid$Site)) {
ggsave(filename = file.path(output_dir, paste0("Risk_Table_", site, ".png")),
plot = risk_tables[[site]], width = 8, height = 3)
}
}
Interpretation:
- Faceted or combined survival plots highlight clearance patterns for each site.
- This step is crucial for visual diagnosis of clearance timelines across multiple categories.
# 10. Additional Recommendations and Checks
# 10.1. Check for Influential Observations
influence_cox <- residuals(cox_model_simple, type = "dfbeta")
if(!is.null(influence_cox)) {
  dfbeta_df <- as.data.frame(influence_cox)
  # Label columns by coefficient; rows are observation indices, not patient IDs
  colnames(dfbeta_df) <- names(coef(cox_model_simple))
  dfbeta_df$Observation <- seq_len(nrow(dfbeta_df))
  library(reshape2)
  dfbeta_melt <- melt(dfbeta_df, id.vars = "Observation")
  dfbeta_plot <- ggplot(dfbeta_melt, aes(x = Observation, y = value, color = variable)) +
    geom_point() +
    geom_hline(yintercept = 0, linetype = "dashed") +
    labs(title = "DFBeta for Simplified Cox Model", x = "Observation index", y = "DFBeta") +
    theme_minimal()
print(dfbeta_plot)
ggsave(filename = file.path(output_dir, "DFBeta_Simplified_Cox_Model.png"),
plot = dfbeta_plot, width = 12, height = 6)
} else {
warning("No dfbeta available for the simplified Cox model.")
}
# 10.2. Assess Outliers in Continuous Variables
age_boxplot <- ggplot(my_data, aes(x = "", y = Age)) +
geom_boxplot(fill = "#2E9FDF") +
labs(title = "Boxplot of Age", y = "Age") +
theme_minimal()
print(age_boxplot)
ggsave(filename = file.path(output_dir, "Boxplot_Age.png"),
plot = age_boxplot, width = 6, height = 6)
# 10.3. Handling Missing Data
# Consider advanced imputation if missingness is not negligible.
Interpretation:
- DFBeta plots: Identify influential points that might unduly affect estimates.
- Boxplot of Age: Quickly spots outliers or unusual distributions.
# 11. End of Script
# Most plots and model summaries have already been saved in previous steps.
# Additional saves can be added if needed.
Written on December 24th, 2024