Search Results for author: Mislav Balunović

Found 15 papers, 11 papers with code

Large Language Models are Advanced Anonymizers

no code implementations • 21 Feb 2024 • Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev

Recent work in privacy research on large language models has shown that they achieve near human-level performance at inferring personal data from real-world online texts.

Text Anonymization

From Principle to Practice: Vertical Data Minimization for Machine Learning

1 code implementation • 17 Nov 2023 • Robin Staab, Nikola Jovanović, Mislav Balunović, Martin Vechev

We propose a novel vertical DM (vDM) workflow based on data generalization, which by design ensures that no full-resolution client data is collected during training and deployment of models, benefiting client privacy by reducing the attack surface in case of a breach.
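
The generalization step can be pictured as each client coarsening its own features before anything leaves the device, so the server only ever sees reduced-resolution values. A minimal sketch of that idea (the bin widths and helper names are illustrative placeholders, not the paper's actual vDM policy):

```python
# Each client coarsens its features locally, so no full-resolution
# data is ever collected. Bin widths / prefix lengths are illustrative.

def generalize_age(age: int, bin_width: int = 10) -> str:
    """Map an exact age to a coarse range, e.g. 37 -> '30-39'."""
    lo = (age // bin_width) * bin_width
    return f"{lo}-{lo + bin_width - 1}"

def generalize_zip(zip_code: str, keep_digits: int = 2) -> str:
    """Keep only a ZIP prefix, e.g. '80331' -> '80***'."""
    return zip_code[:keep_digits] + "*" * (len(zip_code) - keep_digits)

record = {"age": 37, "zip": "80331"}
minimized = {"age": generalize_age(record["age"]),
             "zip": generalize_zip(record["zip"])}
print(minimized)  # {'age': '30-39', 'zip': '80***'}
```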

Beyond Memorization: Violating Privacy Via Inference with Large Language Models

no code implementations • 11 Oct 2023 • Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev

In this work, we present the first comprehensive study on the capabilities of pretrained LLMs to infer personal attributes from text.

Memorization • Text Anonymization

CuTS: Customizable Tabular Synthetic Data Generation

no code implementations • 7 Jul 2023 • Mark Vero, Mislav Balunović, Martin Vechev

To ensure high synthetic data quality in the presence of custom specifications, CuTS is pre-trained on the original dataset and fine-tuned on a differentiable loss automatically derived from the provided specifications using novel relaxations.

Fairness • Synthetic Data Generation
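
A flavor of such a differentiable relaxation: a hard logical specification like "high income implies age over 25" can be softened into a penalty on the generator's per-row feature probabilities, so it can be minimized by gradient descent. The sketch below is a generic soft-constraint penalty, not CuTS's actual derivation:

```python
import torch

def implication_penalty(p_a: torch.Tensor, p_b: torch.Tensor) -> torch.Tensor:
    """Soft violation score for the row constraint A => B.

    Treating p_a and p_b as per-row probabilities, A => B is violated
    with probability p_a * (1 - p_b); the mean is differentiable.
    """
    return (p_a * (1.0 - p_b)).mean()

# Stand-ins for the generator's predicted feature probabilities.
p_income_high = torch.sigmoid(torch.randn(128, requires_grad=True))
p_age_over_25 = torch.sigmoid(torch.randn(128, requires_grad=True))

loss = implication_penalty(p_income_high, p_age_over_25)
loss.backward()  # gradients can flow back into the generator
```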

FARE: Provably Fair Representation Learning with Practical Certificates

1 code implementation • 13 Oct 2022 • Nikola Jovanović, Mislav Balunović, Dimitar I. Dimitrov, Martin Vechev

To produce a practical certificate, we develop and apply a statistical procedure that computes a finite sample high-confidence upper bound on the unfairness of any downstream classifier trained on FARE embeddings.

Fairness • Representation Learning
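
The flavor of such a finite-sample certificate can be seen with a simple Hoeffding-based bound on the demographic parity distance of a fixed classifier. This illustrates a distribution-free, high-confidence upper bound in general, not FARE's actual procedure:

```python
import math

def dp_upper_bound(pos_rate_a: float, n_a: int,
                   pos_rate_b: float, n_b: int,
                   delta: float = 0.05) -> float:
    """Upper-bounds |P(h=1|A) - P(h=1|B)| with prob. >= 1 - delta.

    Hoeffding's inequality gives each group's empirical positive rate a
    confidence half-width of sqrt(log(4/delta) / (2n)); the failure
    probability delta is split across the two group estimates.
    """
    eps_a = math.sqrt(math.log(4 / delta) / (2 * n_a))
    eps_b = math.sqrt(math.log(4 / delta) / (2 * n_b))
    return abs(pos_rate_a - pos_rate_b) + eps_a + eps_b

print(dp_upper_bound(0.62, 5000, 0.55, 4000))  # ~0.114 at 95% confidence
```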

TabLeak: Tabular Data Leakage in Federated Learning

1 code implementation • 4 Oct 2022 • Mark Vero, Mislav Balunović, Dimitar I. Dimitrov, Martin Vechev

A successful attack for tabular data must address two key challenges unique to the domain: (i) obtaining a solution to a high-variance mixed discrete-continuous optimization problem, and (ii) enabling human assessment of the reconstruction, since, unlike for image and text data, direct human inspection is not possible.

Federated Learning • Reconstruction Attack +1
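
For challenge (i), a standard way to make the discrete part amenable to gradient descent is to optimize free logits and relax each categorical cell with a softmax. The sketch below shows this generic relaxation in isolation, with a toy objective standing in for the paper's gradient-matching loss:

```python
import torch

n_categories = 4
logits = torch.zeros(n_categories, requires_grad=True)  # free variable
target_onehot = torch.tensor([0.0, 0.0, 1.0, 0.0])      # unknown true cell

opt = torch.optim.Adam([logits], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    soft = torch.softmax(logits, dim=0)         # differentiable surrogate
    loss = ((soft - target_onehot) ** 2).sum()  # toy reconstruction loss
    loss.backward()
    opt.step()

print(torch.argmax(logits).item())  # 2 -> decode back to the category
```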

Data Leakage in Federated Averaging

1 code implementation • 24 Jun 2022 • Dimitar I. Dimitrov, Mislav Balunović, Nikola Konstantinov, Martin Vechev

On the popular FEMNIST dataset, we demonstrate that, on average, we successfully recover >45% of the client's images from realistic FedAvg updates computed on 10 local epochs of 10 batches each with 5 images, compared to only <10% for the baseline.

Federated Learning

LAMP: Extracting Text from Gradients with Language Model Priors

2 code implementations • 17 Feb 2022 • Mislav Balunović, Dimitar I. Dimitrov, Nikola Jovanović, Martin Vechev

Recent work shows that sensitive user data can be reconstructed from gradient updates, breaking the key privacy promise of federated learning.

Federated Learning • Language Modelling
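
Schematically, such attacks search for text whose gradient matches the observed update while a pretrained language model keeps candidates fluent. The general shape of this family of objectives (with $\alpha$ a weighting hyperparameter; this is the generic form, not LAMP's exact formulation) is

$$\hat{x} \;=\; \arg\min_{x} \;\big\lVert \nabla_\theta \mathcal{L}\big(f_\theta(x), y\big) - g \big\rVert_2^2 \;+\; \alpha \,\mathrm{NLL}_{\mathrm{LM}}(x),$$

where $g$ is the client's observed gradient update and $\mathrm{NLL}_{\mathrm{LM}}$ is the candidate text's negative log-likelihood under the language-model prior.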

Latent Space Smoothing for Individually Fair Representations

1 code implementation • 26 Nov 2021 • Momchil Peychev, Anian Ruoss, Mislav Balunović, Maximilian Baader, Martin Vechev

This enables us to learn individually fair representations that map similar individuals close together by using adversarial training to minimize the distance between their representations.

Fairness • Representation Learning
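
One adversarial-training step of this kind, in generic form (a stand-in linear encoder and an $\ell_\infty$ similarity set of radius eps; not the paper's actual similarity model or smoothing procedure):

```python
import torch

f = torch.nn.Linear(8, 4)   # stand-in encoder
x = torch.randn(16, 8)
eps = 0.1                   # radius of the similarity set

# Inner maximization: random start in the ball, then one signed-gradient
# step toward the "similar" input whose representation is farthest away.
delta = (eps * torch.empty_like(x).uniform_(-1, 1)).requires_grad_()
dist = (f(x + delta) - f(x)).pow(2).sum(dim=1).mean()
dist.backward()
with torch.no_grad():
    delta_adv = (delta + 0.5 * eps * delta.grad.sign()).clamp(-eps, eps)
x_adv = x + delta_adv

# Outer minimization: update the encoder to pull representations of
# similar individuals back together.
opt = torch.optim.SGD(f.parameters(), lr=1e-2)
opt.zero_grad()
(f(x_adv) - f(x)).pow(2).sum(dim=1).mean().backward()
opt.step()
```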

Bayesian Framework for Gradient Leakage

2 code implementations • ICLR 2022 • Mislav Balunović, Dimitar I. Dimitrov, Robin Staab, Martin Vechev

We demonstrate that existing leakage attacks can be seen as approximations of this optimal adversary with different assumptions on the probability distributions of the input data and gradients.

Federated Learning
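
That optimal adversary is simply the maximum a posteriori reconstruction given the observed gradient $g$:

$$x^{*} \;=\; \arg\max_x \; p(x \mid g) \;=\; \arg\max_x \; p(g \mid x)\,p(x).$$

For instance, a Gaussian assumption on $p(g \mid x)$ yields the familiar squared-error gradient-matching term, while the choice of input prior $p(x)$ (e.g., a total-variation prior for images) recovers the regularizers used by existing attacks.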

Fair Normalizing Flows

1 code implementation • ICLR 2022 • Mislav Balunović, Anian Ruoss, Martin Vechev

Fair representation learning is an attractive approach that promises fairness of downstream predictors by encoding sensitive data.

Fairness • Representation Learning +1

Robustness Certification for Point Cloud Models

1 code implementation • ICCV 2021 • Tobias Lorenz, Anian Ruoss, Mislav Balunović, Gagandeep Singh, Martin Vechev

In this work, we address this challenge and introduce 3DCertify, the first verifier able to certify the robustness of point cloud models.

On the Paradox of Certified Training

no code implementations • 12 Feb 2021 • Nikola Jovanović, Mislav Balunović, Maximilian Baader, Martin Vechev

Certified defenses based on convex relaxations are an established technique for training provably robust models.
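
Interval bound propagation (IBP) is among the simplest such relaxations: propagate an input box through the network and train against the worst case. A minimal sketch for one linear layer:

```python
import torch

def linear_ibp(W: torch.Tensor, b: torch.Tensor,
               lo: torch.Tensor, hi: torch.Tensor):
    """Propagate elementwise bounds lo <= x <= hi through W @ x + b."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    new_center = W @ center + b
    new_radius = W.abs() @ radius  # worst case over the input box
    return new_center - new_radius, new_center + new_radius

W, b = torch.randn(3, 4), torch.randn(3)
x, eps = torch.randn(4), 0.1
lo, hi = linear_ibp(W, b, x - eps, x + eps)
assert (lo <= W @ x + b).all() and (W @ x + b <= hi).all()
```

The title's paradox refers to the observation that training with tighter relaxations than this does not necessarily yield better certified models.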

Efficient Certification of Spatial Robustness

1 code implementation • 19 Sep 2020 • Anian Ruoss, Maximilian Baader, Mislav Balunović, Martin Vechev

Recent work has exposed the vulnerability of computer vision models to vector field attacks.
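
A vector field attack perturbs pixel coordinates rather than intensities: schematically, for a small, smooth displacement field $v = (v_1, v_2)$,

$$x'(i, j) \;=\; x\big(i + v_1(i, j),\; j + v_2(i, j)\big),$$

with off-grid coordinates filled in by interpolation, so certification must reason over geometric rather than $\ell_p$ perturbations.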

Learning Certified Individually Fair Representations

1 code implementation • NeurIPS 2020 • Anian Ruoss, Mislav Balunović, Marc Fischer, Martin Vechev

That is, our method enables the data producer to learn and certify a representation in which, for any data point, all similar individuals lie at $\ell_\infty$-distance at most $\epsilon$, thus allowing data consumers to certify individual fairness by proving $\epsilon$-robustness of their classifier.

Fairness • Representation Learning
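
The division of labor behind this certificate can be written out explicitly: if the producer's guarantee on the encoder $f$ and the consumer's robustness proof for the classifier $h$ both hold, individual fairness of the composition follows:

$$\underbrace{\forall x' \simeq x:\; \lVert f(x') - f(x)\rVert_\infty \le \epsilon}_{\text{data producer}} \;\wedge\; \underbrace{\forall z:\; \lVert z - f(x)\rVert_\infty \le \epsilon \Rightarrow h(z) = h(f(x))}_{\text{data consumer}} \;\Longrightarrow\; \forall x' \simeq x:\; h(f(x')) = h(f(x)).$$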
