tech:


Semi-supervised learning

Semi-supervised learning is a machine learning paradigm that combines labeled and unlabeled data to build predictive models.

In traditional supervised learning, models are trained using labeled data where each instance is associated with a known target or class label. Unsupervised learning, on the other hand, deals with unlabeled data, aiming to discover patterns or structure in the data without explicit target labels.

Semi-supervised learning bridges the gap between these two approaches by utilizing both labeled and unlabeled data during model training.

The motivation behind semi-supervised learning is that labeled data is often scarce or expensive to obtain, while unlabeled data is more abundant and easily accessible. By leveraging the additional unlabeled data, semi-supervised learning aims to improve the model’s performance and generalization compared to using labeled data alone.

There are various approaches to semi-supervised learning:

  1. Self-training: The model is first trained on the labeled data and then used to predict labels for the unlabeled data. High-confidence predictions are treated as pseudo-labels, and the model is retrained on the combined labeled and pseudo-labeled data, often over several rounds.
  2. Co-training: Co-training involves training multiple models on different subsets or views of the data. Each model learns from the labeled data and uses its predictions on the unlabeled data to generate additional training examples for the other models. This approach assumes that different views or perspectives of the data provide complementary information.
  3. Generative models: Generative models, such as generative adversarial networks (GANs) or variational autoencoders (VAEs), can be utilized in semi-supervised learning. These models learn the underlying distribution of the data and can generate additional synthetic examples that resemble the unlabeled data. These generated examples can then be combined with the labeled data for training.
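The self-training loop described above is available off the shelf in scikit-learn, whose convention is to mark unlabeled samples with the label -1. A minimal sketch (the dataset, the 20/180 labeled split, and the 0.75 confidence threshold are illustrative choices, not fixed recommendations):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Toy dataset: 200 points, of which we pretend only 20 are labeled.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
y_partial = y.copy()
rng = np.random.default_rng(0)
unlabeled = rng.choice(200, size=180, replace=False)
y_partial[unlabeled] = -1  # scikit-learn's marker for "no label"

# Wrap a base classifier; it is fit on the labeled points, then
# iteratively pseudo-labels unlabeled points it predicts with
# probability >= threshold and refits on the enlarged set.
model = SelfTrainingClassifier(LogisticRegression(), threshold=0.75)
model.fit(X, y_partial)

print(model.score(X, y))
```

The base estimator only needs to expose `predict_proba`, so the same wrapper works around trees, SVMs with probability calibration, or neural classifiers.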

Semi-supervised learning has applications in various domains where labeled data is limited but unlabeled data is abundant. It has been successful in tasks such as document classification, speech recognition, image classification, and anomaly detection.

Semi-supervised learning also faces challenges, chiefly the quality and reliability of the pseudo-labels generated from unlabeled data. If unreliable predictions are accepted uncritically, the model can reinforce its own mistakes (a form of confirmation bias), so pseudo-labels are typically filtered by prediction confidence before being used for further training.
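Confidence filtering of pseudo-labels can be as simple as thresholding the model's class probabilities. A small sketch (the helper name and the 0.9 threshold are hypothetical; in practice the threshold is tuned per task):

```python
import numpy as np

def filter_pseudo_labels(probs, threshold=0.9):
    """Keep only pseudo-labels the model is confident about.

    probs: (n_samples, n_classes) array of predicted probabilities
    on unlabeled data. Returns the indices of confident samples and
    their pseudo-labels; everything below the threshold is discarded
    rather than risk propagating errors into training.
    """
    confidence = probs.max(axis=1)          # top-class probability
    keep = np.where(confidence >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)

probs = np.array([[0.95, 0.05],
                  [0.60, 0.40],
                  [0.10, 0.90]])
idx, labels = filter_pseudo_labels(probs)
# Only rows 0 and 2 clear the 0.9 threshold; row 1 is left unlabeled.
```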



Just in

Tembo raises $14M

Cincinnati, Ohio-based Tembo, a Postgres managed service provider, has raised $14 million in a Series A funding round.

Raspberry Pi is now a public company — TC

Raspberry Pi priced its IPO on the London Stock Exchange on Tuesday morning at £2.80 per share, valuing it at £542 million, or $690 million at today’s exchange rate, writes Romain Dillet. 

AlphaSense raises $650M

AlphaSense, a market intelligence and search platform, has raised $650 million in funding, co-led by Viking Global Investors and BDT & MSD Partners.

Elon Musk’s xAI raises $6B to take on OpenAI — VentureBeat

Confirming reports from April, the Series B round drew participation from multiple well-known venture capital firms and investors, including Valor Equity Partners, Vy Capital, Andreessen Horowitz (A16z), Sequoia Capital, Fidelity Management & Research Company, and Prince Alwaleed Bin Talal and Kingdom Holding, writes Shubham Sharma.

Capgemini partners with DARPA to explore quantum computing for carbon capture

Capgemini Government Solutions has launched a new initiative with the Defense Advanced Research Projects Agency (DARPA) to investigate quantum computing's potential in carbon capture.