
PHP: Detect Encoding

function smartEncodingDetect(string $string, array $priorities = ['UTF-8', 'ISO-8859-1', 'Windows-1252'])
{
    foreach ($priorities as $encoding) {
        // For UTF-8, validate it strictly
        if ($encoding === 'UTF-8' && mb_check_encoding($string, 'UTF-8')) {
            return 'UTF-8';
        }
        // For others, attempt detection
        if (mb_detect_encoding($string, $encoding, true) === $encoding) {
            return $encoding;
        }
    }
    return 'UTF-8'; // safe fallback
}

We’ve all been there. You import a CSV from a client, scrape a legacy website, or process an old text file, and suddenly your output looks like Ã© instead of é. Garbage characters. Mojibake.
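To make the failure concrete, here is a small sketch of how that garbage arises. It assumes the mbstring extension is loaded and that the original text was UTF-8 that some step misread as Windows-1252. The variable names are illustrative only.

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "\xC3\xA9";

// Misread those bytes as Windows-1252 and re-encode the result as UTF-8:
// 0xC3 becomes "Ã" and 0xA9 becomes "©".
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');

echo bin2hex($utf8), "\n"; // c3a9
echo $mojibake, "\n";      // Ã©
```

The bytes themselves never changed. Only the assumption about their encoding was wrong, which is why detecting the real encoding has to happen before any conversion.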

// Wrong approach for text encoding:
$finfo = finfo_open(FILEINFO_MIME_ENCODING);
echo finfo_file($finfo, 'file.txt'); // "us-ascii" or "utf-8" (unreliable)

// Better: read content and detect
$content = file_get_contents('file.txt');
echo mb_detect_encoding($content);

$string = "Café";
$encoding = mb_detect_encoding($string);
echo $encoding; // UTF-8 (usually)

By default, it checks the encodings in the current detection order (see mb_detect_order()). You can pass a custom list of encodings:

$string = "Café";
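As a minimal sketch of passing a custom list, the example below feeds mb_detect_encoding() a string that is deliberately not UTF-8. The candidate list and the sample bytes are assumptions chosen for illustration. The third argument enables strict mode, which rejects any candidate in which the bytes are invalid.

```php
<?php
// "Café" stored as ISO-8859-1: the "é" is the single byte 0xE9.
$latin1 = "Caf\xE9";

// Only these candidates are considered; strict mode (third argument)
// discards encodings in which the byte sequence is not valid.
$candidates = ['UTF-8', 'ISO-8859-1'];
$detected = mb_detect_encoding($latin1, $candidates, true);

echo $detected, "\n"; // ISO-8859-1 (a lone 0xE9 byte is not valid UTF-8)
```

Keep in mind that every byte sequence is valid ISO-8859-1, so a single-byte encoding in the list will match almost anything that is not valid UTF-8. The result is a best guess, not proof of the original encoding.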

// Double-check UTF-8 validity
if ($detected === 'UTF-8' && !mb_check_encoding($string, 'UTF-8')) {
    return 'Windows-1252'; // common fallback
}
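The check above is a fragment, and the function around it is not shown. One way it could be wrapped is sketched below. The function name detectWithFallback, the candidate list, and the handling of a failed detection are all assumptions, not part of the original code.

```php
<?php
// Hypothetical wrapper: the name and the candidate list are illustrative.
function detectWithFallback(string $string): string
{
    // Non-strict detection can report UTF-8 even for malformed input.
    $detected = mb_detect_encoding($string, ['UTF-8', 'ISO-8859-1'], false);

    // Nothing matched at all: use the same fallback (an assumption).
    if ($detected === false) {
        return 'Windows-1252';
    }

    // Double-check UTF-8 validity
    if ($detected === 'UTF-8' && !mb_check_encoding($string, 'UTF-8')) {
        return 'Windows-1252'; // common fallback
    }

    return $detected;
}

// A lone 0xE9 byte is not valid UTF-8, so the result is never 'UTF-8' here.
echo detectWithFallback("Caf\xE9"), "\n";
```

Whichever branch runs, invalid UTF-8 is never labelled as UTF-8, which is the guarantee the double-check is there to provide.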